1. QUICK SUMMARY OF RESEARCH ACCOMPLISHMENTS

1. A new fast algorithm for solving Toeplitz-plus-Hankel systems of equations. The new
algorithm appears to be 33% faster than the previous approach of reformulating the
problem as a block-Toeplitz system of equations.

2. A new fast algorithm for solving block-Toeplitz-plus-Hankel systems of equations. This
algorithm is useful for linear prediction for two-dimensional random fields defined on
a discrete polar raster. The covariance must be a Toeplitz-plus-Hankel function of
both the radial and angular arguments; an isotropic random field has this property.

3.  A  fast  algorithm  for  linear  prediction  for  three-dimensional  random  fields  defined  on 
a  spherical  raster.  The  covariance  must  be  a  Toeplitz-plus-Hankel  function  of  radius 
and  of  the  two  angular  arguments;  a  time-varying  random  field  that  is  wide-sense 
stationary  in  time  has  this  property. 

4. A discrete form of the Bellman-Siegert-Krein resolvent identity, which can be used to
compute smoothing filters from the prediction filters computed using the algorithms
in #2 and #3 above. This generalizes a one-dimensional (1-D) continuous-parameter
result of Kailath to: (1) the discrete case; and (2) two dimensions (2-D).

5. Two new algorithms for estimating a structured Toeplitz-plus-Hankel covariance function
from time series data in 1-D or 2-D. The estimated covariances have the structure
required by the algorithms in #2 and #3 above.

6. A study of the two-dimensional linear prediction problem on a 2-D polar raster, including: (1)
two new algorithms for spectral estimation, using Radon transforms to map the 2-D
problem into 1-D problems; (2) interpolating functions to compute Radon transforms;
and (3) positive-definite covariance extension and correlation matching.

7.  Some  proposed  VLSI  implementations  of  the  1-D  and  2-D  algorithms  described  above. 
The  similarity  of  these  algorithms  to  finite-difference  equations  allows  VLSI  for  finite- 
difference  equations  to  be  adapted  to  these  algorithms,  with  some  changes. 

8. Demonstrations of the new algorithms applied to the problems of: (1) linear predictive
coding of images defined on a polar raster; and (2) smoothing and restoration of
images. Such images arise in tomography and spotlight synthetic aperture radar.




2.  QUICK  REVIEW  OF  LINEAR  PREDICTION  FAST  ALGORITHMS 


Linear  least-squares  estimation  has  played  an  important  and  useful  role  in  modern 
signal processing. It has been applied to problems in one-dimensional prediction and
estimation  with  considerable  success.  In  roughly  the  last  decade,  similar  success  has  been 
achieved  for  multidimensional  estimation  and  smoothing  problems. 

In  order  to  place  the  results  of  this  report  in  proper  perspective,  it  is  worthwhile  to 
briefly  review  some  fast  algorithms  used  in  linear  least  squares  estimation.  More  details 
on  this  material  are  available  in  Section  2  of  Appendix  A. 

2.1  One-Dimensional  Levinson,  Schur,  and  Split  Algorithms 

In the one-dimensional case, for a wide-sense stationary random process, the linear
prediction problem can be solved efficiently using the celebrated Levinson algorithm [1].
This algorithm exploits the Toeplitz structure of the covariance matrix to reduce the number
of multiplications required to solve the Nth-order prediction problem from the O(N^3)
required by Gaussian elimination to O(N^2). The Levinson algorithm recursively computes
the prediction filters in increasing order. In the process, it generates a set of reflection
coefficients that constitute an alternative parametrization of the prediction filters.
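
The recursion described above can be sketched as follows. This is a minimal illustration of the textbook Levinson-Durbin recursion, not the notation of Appendix A; the function name `levinson_durbin`, the sign convention for the reflection coefficients, and the normalization a[0] = 1 are assumptions made for this example.

```python
import numpy as np

def levinson_durbin(r, order):
    """Solve the Yule-Walker equations of the given order from lags r[0..order].

    Returns (a, reflection, err): the prediction-error filter a with a[0] = 1,
    the reflection coefficients, and the final prediction-error power.
    """
    r = np.asarray(r, dtype=float)
    a = np.array([1.0])
    err = r[0]
    reflection = []
    for m in range(1, order + 1):
        # "Inner product" step: correlate the current filter with the lags.
        acc = np.dot(a, r[m:0:-1])
        k = -acc / err
        reflection.append(k)
        # Order update: a_m = [a_{m-1}, 0] + k * reversed([a_{m-1}, 0]).
        a_ext = np.append(a, 0.0)
        a = a_ext + k * a_ext[::-1]
        err *= 1.0 - k * k
    return a, np.array(reflection), err
```

Each order update costs O(m) multiplications, which gives the O(N^2) total; the inner product computed at the top of the loop is the serial step discussed in the next paragraph.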

In the Levinson algorithm, the reflection coefficients must be computed using an “inner
product” expression (equation (2-1b) of Appendix A), which accounts for roughly
one-third of the computation in the algorithm. More importantly, this computation is
not parallelizable. The “inner product” computation can be avoided by using the Schur
algorithm [2] to compute the reflection coefficients directly from the covariance of the random
process. Thus a more efficient procedure for computing the linear prediction filters
is to run the Schur algorithm in parallel with the Levinson algorithm, using the reflection
coefficients computed by the Schur algorithm in the Levinson algorithm [3].
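
A hedged sketch of how the Schur algorithm obtains the reflection coefficients without inner products is given below. It propagates two sequences directly from the covariance lags; the function name `schur_reflection` and the sign convention (chosen so that the m-th reflection coefficient equals the last coefficient of the order-m prediction-error filter) are assumptions of this example, not the notation of [2].

```python
import numpy as np

def schur_reflection(r, order):
    """Reflection coefficients from lags r[0..order] via the Schur recursion."""
    r = np.asarray(r, dtype=float)
    u = r[:order + 1].copy()  # forward sequence u_0(n) = r(n)
    v = r[:order + 1].copy()  # backward sequence v_0(n) = r(n)
    ks = []
    for m in range(1, order + 1):
        # No inner product: the coefficient is a ratio of two stored entries.
        k = -u[m] / v[m - 1]
        ks.append(k)
        # v_{m-1}(n - 1); the zero shifted in only affects entries with
        # index below m, which are never used again.
        v_shift = np.concatenate(([0.0], v[:-1]))
        # Every entry of this update can be computed independently.
        u, v = u + k * v_shift, v_shift + k * u
    return np.array(ks)
```

Because each entry of the update depends only on stored entries from the previous step, the whole update can be carried out in parallel, which is the property the text relies on.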

Recently Delsarte and Genin [4] noted that a redundancy exists in the lattice computations
in the Levinson and Schur algorithms. By replacing the lattice recursions with
a single three-term recurrence, half of the multiplications in the lattice recursions are
avoided. This results in the split Levinson and Schur algorithms, which are more
efficient implementations of the classical Levinson and Schur algorithms.

In the split algorithms, the reflection coefficients parametrizing the prediction filters
are replaced by “potentials” that also parametrize these filters. The split Schur algorithm
can be used to compute these potentials from the covariance function; the potentials are
then inserted into a split Levinson algorithm running in parallel. More importantly, the
split algorithms provide the route by which the one-dimensional Levinson and Schur algorithms
can be extended to higher dimensions, and to more general covariance structures.
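
The three-term recurrence can be illustrated with the following sketch of a symmetric split Levinson recursion. It assumes the common normalization in which the symmetric vector p_k satisfies T_{k+1} p_k = [tau_k, 0, ..., 0, tau_k], with p_0 = 2, p_1 = [1, 1], and tau_0 = c[0]; the function name and return layout are hypothetical. For clarity the sketch does not exploit the symmetry of p_k, which is what actually halves the multiplication count.

```python
import numpy as np

def split_levinson(c, order):
    """Symmetric vectors p_1..p_order, scalars tau_1..tau_order, and potentials."""
    c = np.asarray(c, dtype=float)
    p_prev, p_curr = np.array([2.0]), np.array([1.0, 1.0])  # p_0, p_1
    tau_prev = c[0]                                         # tau_0
    vectors, taus, potentials = [p_curr], [], []
    for k in range(1, order + 1):
        # tau_k is the first row of the (k+1) x (k+1) Toeplitz matrix times p_k.
        tau = np.dot(c[:k + 1], p_curr)
        taus.append(tau)
        if k == order:
            break
        alpha = tau / tau_prev  # the "potential" at step k
        potentials.append(alpha)
        # Three-term recurrence: p_{k+1}(z) = (1 + z) p_k(z) - alpha z p_{k-1}(z).
        p_next = np.append(p_curr, 0.0) + np.insert(p_curr, 0, 0.0)
        p_next[1:-1] -= alpha * p_prev
        p_prev, p_curr, tau_prev = p_curr, p_next, tau
        vectors.append(p_curr)
    return vectors, np.array(taus), np.array(potentials)
```

A single scalar per step drives the recursion, in contrast with the two coupled lattice updates of the classical algorithm.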

2.2  Two-Dimensional  Levinson  and  Schur  Algorithms 

There have been several efforts to generalize the Levinson and Schur algorithms to
two dimensions, in order to simplify the solution of the two-dimensional linear prediction
problem. We quickly summarize these here; for more details see Appendices A and C.

The usual approach is to assume that a two-dimensional random field [5]-[7]: (1) is
defined on a rectangular array of points; (2) is stationary; and (3) has quarter-plane or asymmetric
half-plane causality, i.e., the linear prediction filter for the random field should have
quarter-plane or asymmetric half-plane support. Then the two-dimensional linear prediction
problem can be formulated as a multichannel one-dimensional problem, and solved
using the multichannel Levinson and Schur algorithms [8].

The multichannel Levinson and Schur algorithms are essentially matrix versions of
the one-dimensional algorithms, and they exploit the Toeplitz-block-Toeplitz structure of
the covariance matrix to similarly reduce the number of multiplications needed to solve
the two-dimensional discrete Wiener-Hopf (or Yule-Walker) equations. There are several
variations on this theme, but all essentially reformulate the two-dimensional problem on a
rectangular lattice as a multichannel one-dimensional problem of some kind.
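
To make the Toeplitz-block-Toeplitz structure concrete, the sketch below builds the covariance matrix of a stationary field sampled on a small rectangular lattice, with the samples stacked row by row. The helper names and the row-major ordering are assumptions of this illustration.

```python
import numpy as np

def lattice_covariance(r, rows, cols):
    """Covariance matrix of a stationary field on a rows x cols lattice.

    r(di, dj) returns the covariance at lag (di, dj); samples are stacked
    row by row, so sample (i, j) has index i * cols + j.
    """
    n = rows * cols
    K = np.empty((n, n))
    for p in range(n):
        i1, j1 = divmod(p, cols)
        for q in range(n):
            i2, j2 = divmod(q, cols)
            K[p, q] = r(i1 - i2, j1 - j2)
    return K

def is_toeplitz(M):
    """True if every diagonal of M is constant."""
    return all(
        np.allclose(np.diag(M, d), np.diag(M, d)[0])
        for d in range(-M.shape[0] + 1, M.shape[1])
    )
```

Partitioning the result into cols x cols blocks shows that each block is Toeplitz and that the blocks themselves repeat along block diagonals, while the full matrix is not Toeplitz. This is why the problem is treated as a multichannel (block) one-dimensional problem.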
3. RESEARCH OBJECTIVES


The goals of this project were as follows:

1. To develop two-dimensional versions of the Levinson and Schur algorithms that relax
the causality requirements of existing two-dimensional algorithms, and replace them
with causality assumptions that are more physically reasonable. These algorithms
should also not require stationarity of the random field, but allow a more general
structure in the covariance function;

2. To develop algorithms for the smoothing problem, as opposed to the prediction problem,
for random fields. Since random fields are in general not causally generated,
the use of the prediction filters computed using the algorithms in #1 is limited to
linear predictive coding of the random field. Linear least-squares filters suitable for
reducing noise and restoration should be smoothing filters that use all the noisy data
to estimate the random field at any point;

3. To develop three-dimensional versions of the algorithms in items #1 and #2, suitable
for three-dimensional random fields. Such random fields describe random processes
defined over space, e.g., temperature, images varying in time, etc.;

4. To successfully implement these algorithms, study their numerical behavior, and apply
them to some problems in image restoration, smoothing, and linear predictive coding.

All four goals have been successfully accomplished, as this report will demonstrate.

In  addition,  we  have  accomplished  the  following  additional  goals: 

5.  To  develop  algorithms  for  estimating  from  1-D  and  2-D  time  series  data  covariances 
with  the  structure  required  by  the  above  algorithms; 

6. To study the two-dimensional linear prediction problem on a polar raster, and develop
two-dimensional spectral estimation algorithms that ensure non-negative spectral
estimates;

7.  To  develop  possible  VLSI  implementations  of  the  generalized  Levinson  and  Schur 
algorithms. 
4. RESEARCH ACCOMPLISHMENTS


This section contains a concise summary of our research results. Technical details are
provided in the Appendices, as noted below. Part I includes Sections 4.1-4.5, and covers
development of fast algorithms for determining optimal filters for least-squares estimation
of random fields. Part II includes Sections 4.6-4.9, and covers estimation of structured
covariances from 1-D and 2-D time series data, spectral estimation on a polar raster from
2-D time series data, applications, and VLSI implementations.

All  of  the  results  presented  below  are  new  contributions  to  the  field  of  linear  prediction. 

Part I: Fast Algorithms for Optimal Filters

4.1  Continuous-Parameter  Results 

Our original proposal was formulated in continuous-parameter space, since our preliminary
results were all continuous-parameter algorithms. Specifically, the goal was to
develop fast algorithms for solving the multi-dimensional Wiener-Hopf integral equation

\[ k(x,y) = h(x,y) + \int_{|z| < |x|} h(x,z)\, k(z,y)\, dz, \qquad |y| < |x|, \quad x, y \in \mathbb{R}^n, \quad n = 1, 2, 3. \]

The solution h(x,y) of this integral equation is the optimal linear least-squares filter for
computing the estimate ŝ(x) of a zero-mean random field s(x) with covariance k(x,y) from
noisy observations {w(z) = s(z) + v(z), |z| < |x|}, where v(z) is zero-mean white noise.
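
A one-dimensional discrete analogue may help fix ideas. The sketch below assumes unit-variance white noise (consistent with the form of the integral equation) and solves, for each n, the discrete equation k(n,j) = h(n,j) + sum over m <= n of h(n,m) k(m,j), for j <= n, by brute force. The function name `causal_filters` is hypothetical, and this direct solve is exactly the expensive computation that the fast algorithms of this report are designed to avoid.

```python
import numpy as np

def causal_filters(K):
    """Row n holds the filter estimating s(n) from observations w(0..n).

    K is the covariance matrix of the signal; the observation noise is
    assumed white with unit variance.
    """
    N = K.shape[0]
    H = np.zeros((N, N))
    for n in range(N):
        Kn = K[:n + 1, :n + 1]
        # h_n (I + K_n) = k_n; K_n is symmetric, so solve the transposed system.
        H[n, :n + 1] = np.linalg.solve(np.eye(n + 1) + Kn, K[n, :n + 1])
    return H
```

Solving each system separately costs O(n^3) per order; the recursive algorithms instead update the filter from one order to the next.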

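In [9] and [10] we noted for n = 3 that if the covariance function k(x,y) satisfies
$(\Delta_x - \Delta_y)\, k(x,y) = 0$, then the prediction filter h(x,y) satisfies the differential form

\[ (\Delta_x - \Delta_y)\, h(x,y) = \int_S V(x,e)\, h(|x|e, y)\, |x|^2\, de, \]

where $\Delta_x$ and $\Delta_y$ are the Laplacians with respect to x and y, S is the unit sphere, and e
is a unit vector in $\mathbb{R}^3$. The defining expression for the potential V(x,e) is illegible in
this copy; it is given in Appendix A.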

The derivation of this equation and its implications are discussed extensively in Appendix A.
Here we merely note some significant facts:

1. The differential form can be viewed as a three-dimensional, continuous-parameter
generalization of the split Levinson recurrence. V(x,e) is a similar generalization
of the potential in the split algorithms;

2. A similar differential form, initialized using the covariance function k(x,y), can be
used to compute V(x,e) from k(x,y). This can be viewed as a three-dimensional,
continuous-parameter generalization of the split Schur recurrence;

3. The differential form can be propagated recursively in increasing prediction “order”
|x|, for all |y| < |x|. In this way, it is possible to solve the three-dimensional Wiener-Hopf
equation recursively in increasing |x|;

4. The structure $(\Delta_x - \Delta_y)\, k(x,y) = 0$ required of the covariance function can be
viewed as a generalization of the block-Toeplitz structure required by previous two-dimensional
Levinson algorithms. However, it is much more general: note that
isotropic (k(x,y) = k(|x - y|)) and homogeneous (k(x,y) = k(x - y)) random fields
are included as special cases of this structure;

5. The causality assumed in the random field prediction filters h(x,y) is simply that
h(x,y) = 0 for |y| > |x|, i.e., causality is defined simply in terms of radius. This is
more reasonable physically than quarter-plane causality, lexicographic ordering, etc.
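
As a quick check of the claim in item 4 that homogeneous fields satisfy the required structure, write $k(x,y) = \kappa(x - y)$ for a twice-differentiable function $\kappa$ (the symbol $\kappa$ is introduced here only for this check). By the chain rule,

\[ \Delta_x\, \kappa(x - y) = (\Delta\kappa)(x - y), \qquad \Delta_y\, \kappa(x - y) = (-1)^2 (\Delta\kappa)(x - y) = (\Delta\kappa)(x - y), \]

so that $(\Delta_x - \Delta_y)\, k(x,y) = 0$. An isotropic field is the special case $\kappa(u) = f(|u|)$ and therefore satisfies the same condition.
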

The above differential forms generate the prediction filter for the random field. However,
for estimation, noise reduction, and image restoration, the smoothing filter, which
uses all noisy observations (including |y| > |x|), is desirable. In the one-dimensional case,
Kailath [10] has shown that the smoothing filter can be easily obtained from the prediction
filter using the Bellman-Siegert-Krein resolvent identity. For our purposes, this
is simply a differential equation relating the smoothing and prediction filters. A three-dimensional
generalization of the result of [10], applicable to the smoothing problem for
three-dimensional random fields, is derived in Appendix A, which consists of the following
paper: A.E. Yagle, “Analogues of Split Levinson, Schur, and Lattice Algorithms for
Three-Dimensional Random Field Estimation Problems,” SIAM J. Appl. Math., vol. 50,
no. 6, pp. 1780-1799, Dec. 1990.

A major part of this project has focused on deriving inherently discrete versions or
counterparts to the above continuous algorithms, and one- and two-dimensional versions of
the above three-dimensional algorithms. There are several reasons for this:

1. Since the data to be processed is most likely sampled or discrete in nature, the actual
problem of interest is the discrete version;

2. Any continuous algorithms must ultimately be discretized before they can be implemented
on a computer; however, discretization errors will be eliminated if an inherently
discrete version of the algorithms, applicable to discrete problems, can be found;

3. The one-dimensional and two-dimensional cases are important in their own right. The
two-dimensional case is particularly important, due to image processing applications;

4. Our initial attempts along these lines quickly met with success (see below).

4.2  One-Dimensional  Toeplitz-Plus-Hankel  Systems 

The one-dimensional discrete version of this algorithm is quite interesting in its own
right. It is a generalization of the split Levinson and Schur algorithms that solves Toeplitz-plus-Hankel
systems of equations (i.e., systems of equations in which the system matrix is
the sum of an arbitrary Toeplitz matrix and an arbitrary Hankel matrix). This algorithm
requires only half as many multiplications as a previous algorithm [11] for such systems of
equations.
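
One way to see why Toeplitz-plus-Hankel matrices are the natural discrete counterpart of the structure $(\Delta_x - \Delta_y)k = 0$ is that their entries satisfy a discrete version of that equation: A[i+1,j] + A[i-1,j] - A[i,j+1] - A[i,j-1] = 0 at interior entries. The sketch below checks this numerically; the function names and the indexing of the generating sequences `t` and `h` are assumptions of this illustration.

```python
import numpy as np

def toeplitz_plus_hankel(t, h, n):
    """n x n matrix with entries t(i - j) + h(i + j).

    t holds the Toeplitz generators for lags -(n-1)..(n-1) (length 2n-1);
    h holds the Hankel generators for index sums 0..2n-2 (length 2n-1).
    """
    t = np.asarray(t, dtype=float)
    h = np.asarray(h, dtype=float)
    i, j = np.indices((n, n))
    return t[i - j + n - 1] + h[i + j]

def displacement(A):
    """A[i+1, j] + A[i-1, j] - A[i, j+1] - A[i, j-1] at interior entries."""
    return A[2:, 1:-1] + A[:-2, 1:-1] - A[1:-1, 2:] - A[1:-1, :-2]
```

For any choice of `t` and `h` the displacement vanishes, whereas it does not for a generic matrix. The two generating sequences also show where the doubled number of degrees of freedom, relative to a purely Toeplitz matrix, comes from.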

Toeplitz-plus-Hankel systems of equations arise in linear-phase prediction filter design,
the Hildebrand-Prony spectral line estimation procedure, Padé approximation, and atmospheric
scattering, in addition to the nonstationary process linear prediction application
motivating this algorithm here.

The heart of the algorithm is a four-term recurrence that uses two potentials, as
compared to the usual split algorithm recurrence that is a three-term recurrence using a
single potential. Since a Toeplitz-plus-Hankel system has twice as many degrees of freedom
as the purely Toeplitz system solved by the usual split algorithms, this is reasonable, and
it seems to be efficient.

Details are given in Appendix B, which consists of the paper: A.E. Yagle, “New
Analogues of Split Algorithms for Arbitrary Toeplitz-plus-Hankel Matrices,” to appear
in IEEE Trans. Signal Processing, vol. ASSP-39, no. 11, Nov. 1991. These include
application to Toeplitz-plus-Hankel normal and Yule-Walker equations, arbitrary Toeplitz-plus-Hankel
systems of equations, and simplifications to the classical split algorithms [4]
for purely Toeplitz systems.
4.3 Two-Dimensional Block-Toeplitz-Plus-Hankel Systems

The two-dimensional discrete version of this algorithm solves block Toeplitz-plus-Hankel
systems of equations. This algorithm is useful for linear prediction for two-dimensional
random fields defined on a discrete polar raster. The covariance must be
a Toeplitz-plus-Hankel function of both the radial and angular arguments; the important
case of an isotropic random field has this property.

Random fields defined on a discrete polar raster arise in tomography and spotlight
synthetic aperture radar. Although such data could be interpolated onto a rectangular
lattice, this is necessarily inexact; it also affects the covariance function. For example,
the covariance function for an isotropic random field on a rectangular lattice is a Toeplitz
function of both the abscissae and the ordinates, leading to a Toeplitz-block-Toeplitz covariance
matrix in the two-dimensional discrete Wiener-Hopf equation. The multichannel
Levinson algorithm can be used on this system.

However, the covariance function for an isotropic random field on a polar raster is a
Toeplitz-plus-Hankel function of the radii and a Toeplitz function of the angular arguments,
leading to a block Toeplitz-plus-Hankel covariance matrix in the two-dimensional discrete
Wiener-Hopf equation. The multichannel Levinson algorithm cannot be used to solve this
problem; only the new algorithm of this section is applicable.

Remarkably, the basic recurrence for this algorithm is essentially a discrete version of
the continuous-parameter differential form, with the Laplacians becoming discrete Laplacians
and the integral becoming a sum. This is notable because the explicitly discrete
algorithm is an exact solution to the discrete problem, rather than just a discretized form
of the continuous algorithm. In the continuous limit, the discrete algorithm approaches
the continuous differential form, as expected.

Details are given in Appendix C, which consists of the paper: W.-H. Fang and A.E. Yagle,
“Discrete Fast Algorithms for Two-Dimensional Linear Prediction on a Polar Raster,”
to appear in IEEE Trans. Signal Processing, vol. ASSP-40, no. 6, June 1992. This
includes a discussion of application to isotropic and other random fields, details of the
reduction to the continuous case, and resulting simplifications.
4.4 Three-Dimensional Block-Toeplitz-Plus-Hankel Systems

The  three-dimensional  discrete  version  of  this  algorithm  solves  the  linear  prediction 
problem  for  three-dimensional  random  fields  defined  on  a  spherical  raster.  The  covariance 
must  be  a  Toeplitz-plus-Hankel  function  of  radius  and  of  the  two  angular  arguments;  a 
time-varying random field that is wide-sense stationary in time has this property.

This result is a direct extension of the 2-D algorithm. For a summary and derivation
of  this  algorithm,  see  Appendix  D. 

4.5 One- and Two-Dimensional Discrete Bellman-Siegert-Krein (BSK) Resolvent Identities

Kailath [10] has noted the applicability of the BSK resolvent identity to computing
one-dimensional smoothing filters from prediction filters. We have developed a discrete
version of Kailath’s result, and numerically implemented it. We have also developed a two-dimensional
discrete version of the BSK identity relating the prediction filters for two-dimensional
random fields on a polar raster to the smoothing filters for such random fields. The two-dimensional
discrete algorithm has also been successfully implemented numerically.

The significance of this result is noted in Section 4.9 below, in which the improvement from
using smoothing filters instead of prediction filters is demonstrated on several examples.
For a polar raster with N points along each of N radial directions, the number of multiplications
needed to compute the smoothing filter is reduced from O(N^6) using Gaussian
elimination to O(N^4), if the algorithm in Section 4.3 is used to compute the prediction filters
and the discrete BSK algorithm is then used to compute the smoothing filters.

Details are given in Appendix E, which consists of the paper: W.-H. Fang and A.E.
Yagle, “Fast Algorithms for Linear Least-Squares Smoothing Problems in One and Two
Dimensions using Generalized Discrete Bellman-Siegert-Krein Resolvent Identities,” to appear
in IEEE Trans. Signal Processing, vol. ASSP-40, no. 6, June 1992. It includes details
of the reduction to the continuous case, and resulting simplifications. The continuous case
is treated in [10] for the one-dimensional case, and in Section 5 of Appendix A for the
three-dimensional case.
Part II: Algorithms for Covariance and Spectral Estimation

4.6 Structured Estimation of Covariances


This  second  part  covers  research  into  estimating  an  unknown  covariance,  with  the 
(block)  Toeplitz-plus-Hankel  structure  required  by  the  above  algorithms,  from  1-D  or  2-D 
time  series  data.  In  this  section  we  discuss  covariance  estimation;  in  the  next,  spectral 
estimation. 

There has been much work on estimating stationary covariance functions
from data. A common procedure is to estimate autocorrelation lags from the data,
form a covariance matrix, and then “Toeplitzify” it by averaging along the diagonals of
the covariance matrix. This procedure is the orthogonal projection, with respect to the
Hilbert-Schmidt inner product, of the data lag matrix onto the subspace of symmetric Toeplitz matrices.
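
The diagonal-averaging procedure can be sketched in a few lines. The function name `toeplitzify` is hypothetical; the input is symmetrized first so that each diagonal is averaged together with its mirror image.

```python
import numpy as np

def toeplitzify(R):
    """Project a square lag matrix onto the symmetric Toeplitz matrices
    by averaging along its diagonals."""
    n = R.shape[0]
    S = 0.5 * (R + R.T)  # average each diagonal with its mirror image
    T = np.empty_like(S)
    i, j = np.indices((n, n))
    for d in range(n):
        # Replace every entry at distance d from the main diagonal by the mean.
        T[np.abs(i - j) == d] = np.mean(np.diag(S, d))
    return T
```

The residual R - T has zero Hilbert-Schmidt inner product with every symmetric Toeplitz matrix, which is what makes the averaged matrix the closest symmetric Toeplitz matrix to R in that norm.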

We have extended this approach. We have derived an algorithm that projects the
data lag matrix onto the subspace of symmetric Toeplitz-plus-Hankel matrices. This subspace
is computed using a Gram-Schmidt orthonormalization. The procedure finds the
closest (Hilbert-Schmidt norm) symmetric Toeplitz-plus-Hankel matrix to the given data
lag matrix. Unfortunately, this procedure is more complicated than simply averaging
along diagonals. The Toeplitz part of the projection is still found by averaging along
diagonals; however, the Hankel part of the projection requires weighted sums of some
data lag matrix elements.
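
A minimal numerical sketch of such a projection is given below. It is not the Gram-Schmidt procedure of Appendix F: a least-squares solve over indicator basis matrices is swapped in here, which yields the same orthogonal projection. The function names are hypothetical.

```python
import numpy as np

def th_basis(n):
    """Columns are vectorized indicator matrices spanning the symmetric
    Toeplitz-plus-Hankel matrices of size n x n."""
    i, j = np.indices((n, n))
    toeplitz = [(np.abs(i - j) == d).astype(float) for d in range(n)]
    hankel = [(i + j == s).astype(float) for s in range(2 * n - 1)]
    return np.stack([b.ravel() for b in toeplitz + hankel], axis=1)

def project_toeplitz_plus_hankel(R):
    """Closest (Hilbert-Schmidt norm) symmetric Toeplitz-plus-Hankel matrix to R."""
    n = R.shape[0]
    B = th_basis(n)
    # The basis is rank deficient (for example, the all-ones matrix is both
    # Toeplitz and Hankel); lstsq still returns a valid minimizer.
    coeffs, *_ = np.linalg.lstsq(B, R.ravel(), rcond=None)
    return (B @ coeffs).reshape(n, n)
```

The overlap between the Toeplitz and Hankel subspaces is the reason the Hankel part cannot be obtained by simple antidiagonal averaging alone.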

Due to the complexity of this algorithm, we have developed a second algorithm that
truly generalizes the “Toeplitzation” of averaging along diagonals into a “Toeplitz-plus-Hankelization”
of averaging along diagonals and antidiagonals. The resulting estimated
Toeplitz-plus-Hankel matrix has slightly more structure than required, but the algorithm is
much simpler than the first algorithm. In addition, constraints such as positive definiteness
and rank constraints can be incorporated into a slightly different but equivalent form of
this algorithm. Finally, a two-dimensional version of this latter algorithm has also been
derived.

Details are given in Appendix F, which consists of the paper: W.-H. Fang and A.E.
Yagle, “Two Methods for Toeplitz-plus-Hankel Approximation to a Data Covariance Matrix,”
to appear in IEEE Trans. Signal Processing, vol. ASSP-40, no. 6, June 1992.
4.7 2-D Spectral Estimation on a Polar Raster

We  consider  the  following  spectral  estimation  problem.  A  zero-mean  homogeneous 
random  field  is  defined  on  a  polar  raster.  Given  discrete  sample  values  inside  a  disk  of 
finite  radius,  estimate  the  field’s  power  spectral  density  using  a  linear  prediction  model. 

Issues arising here include: (1) estimation of covariance lags; (2) extendibility of
a finite set of lag estimates into a positive semi-definite covariance extension (required
for a meaningful spectral density); and (3) in the absence of such an extension,
guaranteeing a non-negative spectral density.

Recall that the covariance extension property does not hold on a rectangular raster.
However, we give a generalized autocorrelation procedure that guarantees a positive semi-definite
covariance extension. It first interpolates the data using Gaussians, computes
its Radon transform, and then applies one-dimensional spectral estimation techniques to
each slice. We show that if each 1-D set of covariance lags is positive semi-definite, then
the extended covariance is also positive semi-definite, so that the 2-D spectral estimate is
non-negative and hence meaningful.
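
The link between the 1-D slices and the 2-D spectrum is the standard projection-slice theorem, stated here in notation assumed for this note rather than taken from Appendix G. For a homogeneous covariance $k(x)$, $x \in \mathbb{R}^2$, and a unit vector $\theta$, define the Radon projection

\[ (\mathcal{R}_\theta k)(t) = \int_{\mathbb{R}^2} k(x)\, \delta(t - x \cdot \theta)\, dx. \]

Its 1-D Fourier transform in $t$ is a radial slice of the 2-D power spectral density $S$:

\[ \int_{-\infty}^{\infty} (\mathcal{R}_\theta k)(t)\, e^{-i\omega t}\, dt = \int_{\mathbb{R}^2} k(x)\, e^{-i\omega\, \theta \cdot x}\, dx = S(\omega\theta). \]

Hence if the 1-D spectrum obtained along every direction $\theta$ is non-negative, the 2-D spectral density is non-negative on every radial line, which is the sense in which slice-by-slice positive semi-definiteness carries over to the 2-D estimate.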

The correlation matching property that the extended covariance lags should match
the given covariance lags holds in the Radon domain, but not in the spatial domain. We
also propose a second algorithm that: (1) matches the given covariance lags; and (2)
gives a positive semi-definite extension of them, when this is possible. We also discuss
circumstances when this is impossible, shedding some light on 2-D covariance extension.

Details  are  provided  in  Appendix  G,  which  consists  of  the  paper:  W.-H.  Fang  and  A.E. 
Yagle,  “Two-Dimensional  Linear  Prediction  and  Spectral  Estimation  on  a  Polar  Raster,” 
submitted to IEEE Trans. Signal Processing.

4.8  VLSI  Implementations  of  Fast  Algorithms 

The generalized Levinson and Schur algorithms in Part I are amenable to parallel
implementation.  The  similarity  of  their  recursions  to  finite  difference  equations  suggests 
that  VLSI  implementations  for  finite  differences  might  be  applied  to  these  algorithms.  This 
turns  out  to  be  the  case,  although  some  changes  are  required,  and  certain  special  cases 
allow  simpler  implementations. 

Details are provided in Appendix H, which consists of the paper: W.-H. Fang and
A.E. Yagle, “A Systolic Architecture for New Split Algorithms for Arbitrary Toeplitz-plus-Hankel
Matrices,” submitted to IEEE Trans. Signal Processing.

4.9  Linear  Predictive  Coding  and  Smoothing  of  Random  Fields 

The two-dimensional discrete algorithm for random fields on a polar raster has been
applied to linear predictive coding of isotropic random fields on a polar raster. One application
is in storing images defined on a polar raster (e.g., tomographic data and spotlight
synthetic aperture radar data): storing the residuals, instead of the original image, requires
far fewer bits.

The results of using the algorithm are compared with the much simpler procedure
of using linear predictive coding independently along each radial slice; this amounts to
assuming each radial slice of the image is independent of each other slice. This is of course
not true for an isotropic random field, and our results show the significant improvement
in image compression ratio using the two-dimensional algorithm.
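
The simpler baseline, linear predictive coding along a single slice, can be sketched as follows. This is an illustrative least-squares fit on one 1-D sequence, not the two-dimensional algorithm of the report; the function name `lpc_residual` and the covariance-method fit are assumptions of this example.

```python
import numpy as np

def lpc_residual(x, order):
    """Fit a linear predictor of the given order by least squares.

    Returns (residual, coeffs), where residual[n] is the part of x[n + order]
    not predicted from the `order` preceding samples.
    """
    x = np.asarray(x, dtype=float)
    # Column i holds the sample i + 1 steps before each target sample.
    past = np.column_stack(
        [x[order - i - 1:len(x) - i - 1] for i in range(order)]
    )
    target = x[order:]
    coeffs, *_ = np.linalg.lstsq(past, target, rcond=None)
    return target - past @ coeffs, coeffs
```

For a strongly correlated sequence the residual has much smaller variance than the signal, so it can be quantized with fewer bits. The two-dimensional algorithm improves on this baseline by also exploiting the correlation between slices.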

The two-dimensional algorithm is also applied to smoothing isotropic random fields,
in order to reduce noise. This has obvious applications in any setting in which the data
consists of noisy observations of a random field. First the prediction filter alone is used
to estimate the random field (this is analogous to using previous two-dimensional least-squares
filters derived using quarter-plane causality on a rectangular lattice). Then the
smoothing filter, derived using the two-dimensional discrete BSK equation, is employed.

The results show considerable improvement (about 8 dB) in signal-to-noise ratio when
the smoothing filter is used, but only about 1 dB improvement when the prediction filter is used
alone. This demonstrates the importance of the BSK equation: the smoothing filters are
indeed necessary.
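
The advantage of smoothing over causal filtering can be illustrated on a small 1-D example using only covariance algebra. The sketch below is not the experiment of Appendix I and will not reproduce its figures; the function name, the exponential covariance used in the usage note, and the unit noise variance are assumptions.

```python
import numpy as np

def error_variances(K, noise_var=1.0):
    """Mean-square errors of the causal filter and of the smoother.

    K is the signal covariance; observations are signal plus white noise.
    Returns (filt_err, smooth_err), one entry per sample.
    """
    N = K.shape[0]
    G = K + noise_var * np.eye(N)  # covariance of the observations
    # Smoother: each sample is estimated from all N observations.
    smooth_err = np.diag(K - K @ np.linalg.solve(G, K))
    # Causal filter: sample n is estimated from observations 0..n only.
    filt_err = np.empty(N)
    for n in range(N):
        g = K[:n + 1, n]
        filt_err[n] = K[n, n] - g @ np.linalg.solve(G[:n + 1, :n + 1], g)
    return filt_err, smooth_err
```

With, for example, `K[i, j] = 0.9 ** abs(i - j)`, the quantity `10 * np.log10(filt_err / smooth_err)` gives the per-sample gain in dB from using all the data; it is non-negative everywhere and vanishes only at the last sample, where the two estimates use the same observations.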

The  results  are  given  in  more  detail  in  Appendix  I. 
ABSTRACT

Fast algorithms for computing the linear least-squares estimate of a three-dimensional
random field from noisy observations inside a sphere are proposed. The algorithms can be
viewed as three-dimensional analogues of the split Levinson, Schur, and lattice algorithms
of linear prediction, since they exploit an (assumed) Toeplitz-plus-Hankel structure of
the double Radon transform of the random field covariance. Therefore these algorithms
require fewer computations than would solution of the three-dimensional Wiener-Hopf
integral equation. Unlike previous generalized Levinson algorithms, no quarter-plane or
asymmetric half-plane support assumptions for the filter are necessary; nor is the three-dimensional
filtering problem treated as a multichannel (vector) filtering problem.

The algorithms work in three stages. First, the three-dimensional split Schur algorithm
computes a potential from the covariance of the random field. This potential is a
three-dimensional analogue of the parameter appearing in the split Levinson algorithm.
Alternatively, the three-dimensional split lattice algorithm may be used to compute the
potential from the canonical spectral factor of the covariance of the observation field.
Next, the three-dimensional split Levinson algorithm computes the Radon transform of
the three-dimensional prediction filter for estimating the random field on the surface of
the sphere of noisy observations. Finally, this filter is used to compute the smoothing
filter for estimating the random field inside the sphere of observations. The algorithms
generalize known results for isotropic, two-dimensional random fields.
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1.  Introduction.  The  problem  of  computing  linear  least-squares  estimates  of  three- 
dimensional  random  fields  from  noisy  observations  is  important  in  such  fields  as  meteorol¬ 
ogy  and  processing  of  time  varying  images.  The  enormous  amount  of  computation  involved 
in  three-dimensional  signal  processing  requires  fast  algorithms  that  exploit  any  structure 
in  the  problem,  and  that  can  be  parallelized.  The  obvious  choices  of  fast  algorithms  for 
computing  estimates  from  covariance  information  are  three-dimensional  generalizations  of 
the  one-dimensicnal  Levinson,  Schur,  and  lattice  algorithms. 

Considerable effort has been applied to generalizing the Levinson algorithm to two
dimensions. Although many useful algorithms have been obtained, all of them require some
assumptions about the filter, i.e., the order in which the data are processed, as opposed
to the random field itself. The filters constructed from existing two-dimensional Levinson
algorithms are required to have quarter-plane support, or asymmetric half-plane support,
or some other such condition, due to the necessity of imposing some well-defined processing
order on two-dimensional data. Another approach is to assume line-by-line scanning,
so that the two-dimensional estimation problem can be reformulated as a multichannel
one-dimensional problem, to which the multichannel Levinson algorithm can be applied.
Although these assumptions are appropriate for some image processing problems, they are
inappropriate for the general estimation problem. Also, extending these conditions to the
three-dimensional problem is not trivial.

In this paper we take a different approach. Following [1] and [2] we operate directly
on the three-dimensional Wiener-Hopf integral equation, converting it into a
three-dimensional differential form. A Radon transform converts this form into a coupled system
of partial differential equations that can be propagated, reconstructing the Radon
transform of the solution to the integral equation. Alternatively, the differential form can be
propagated directly, without resort to the Radon transform. The coupled system of equations
can be viewed as a three-dimensional, continuous-parameter analogue of the split
Levinson algorithm of linear prediction [3].

The potential required to propagate these equations is obtained from a three-dimensional
analogue of the split Schur algorithm [3]. The split Schur algorithm is initialized using the
covariance of the random field. Alternatively, the potential may be computed by initializing
a three-dimensional analogue of the split lattice algorithm [3]. This algorithm is initialized
using the canonical spectral factor of the double Radon transform of the observation field
covariance.

All of this is a generalization of what the more familiar one-dimensional Levinson,
Schur, and lattice algorithms do, except that the potential function, rather than reflection
coefficients, characterizes the optimal filters. Our nomenclature for the three new
algorithms thus follows function, rather than form, although there are some marked similarities
in form as well.

The procedure proposed here has three stages. The random field covariance is assumed
to have a three-dimensional displacement property (equation (3-7) below), so that
its double Radon transform has Toeplitz-plus-Hankel structure. Either the random field
covariance, or the canonical spectral factor of the covariance of the observation field, may
be used to initialize the three-dimensional split Schur or lattice algorithms, respectively.
Both of these algorithms compute a three-dimensional version of the potential parameter
appearing in the one-dimensional split Levinson, Schur, and lattice algorithms [3]. Next,
this potential parameter is used in the three-dimensional split Levinson algorithm to
compute the Radon transform of the filter for estimating the random field on the surface of a
sphere of noisy observations. Finally, the smoothing filter for estimating the random field
inside the sphere of observations is obtained from this filter. A similar approach was used
for one-dimensional random fields with Toeplitz covariances in [4], and for two-dimensional
isotropic random fields in [5].

It is important to emphasize that NO assumptions are made on the order of processing
of the data. The filters themselves are generated recursively, but the data are not processed
in any specific order. The fast algorithm is due entirely to the displacement property
(3-7) of the random field covariance, which is the three-dimensional generalization of the
Toeplitz structure exploited by the one-dimensional Levinson and Schur algorithms.

The numerical performance of the new algorithms has not yet been studied, and so
they should be viewed as only proposed numerical procedures. However, the insight these
algorithms give into the three-dimensional estimation problem, and the way in which they
demonstrate how results for one-dimensional and isotropic two-dimensional random fields
generalize to three dimensions, is of some interest.

The paper is organized as follows. Section 2 quickly reviews the one-dimensional split
Levinson, Schur, and lattice algorithms of [3]. Section 3 specifies the problem in detail,
discusses the generalized displacement property (3-7), and quickly reviews the Radon
transform. Section 4 derives the differential form of the three-dimensional Wiener-Hopf
integral equation, and uses it to derive the new fast algorithms: the three-dimensional split
Levinson, Schur, and lattice algorithms. Section 5 notes how the smoothing filter is
obtained, and summarizes the three-stage procedure. Section 6 concludes by summarizing
the paper and noting directions for possible future research. Some derivations are relegated
to Appendices.

2. The One-Dimensional Split Algorithms. We quickly summarize the one-dimensional
split Levinson, Schur, and lattice algorithms of [3], and discuss briefly their
scattering interpretations. It should be noted that these algorithms arise in the contexts
of inverse scattering [6], network synthesis [7], and orthogonal polynomials [8]. For a
historical overview of their place in estimation theory, see [9].

2.1 Classical Levinson Algorithm. Consider the one-dimensional linear prediction
problem of estimating the present value of a zero-mean, stationary, discrete-time random
process $x(i)$ from observations $\{x(j),\ i-n \le j \le i-1\}$ of its past $n$ values. It is well known
that the optimal linear prediction filter coefficients can be obtained using the Levinson
algorithm. Let $R(z)$ be the z-transform (where $z$ is the unit delay operator) of one side of
the covariance sequence of $x(i)$. Then the $n$th-order prediction error filter $A_n(z)$ can be
recursively computed as follows [10]:

$$\begin{bmatrix} A_n(z) \\ B_n(z) \end{bmatrix} = \begin{bmatrix} 1 & z k_n \\ k_n & z \end{bmatrix} \begin{bmatrix} A_{n-1}(z) \\ B_{n-1}(z) \end{bmatrix} \tag{2-1a}$$

$$k_{n+1} = -\left.\frac{A_n(z) R(z)}{z^{n+1} P_n}\right|_{z=0} \tag{2-1b}$$

$$P_n = (1 - k_n^2) P_{n-1} \tag{2-1c}$$

$$A_0(z) = B_0(z) = 1 \tag{2-1d}$$

In (2-1b) and the sequel, the notation $f(z)|_{z=0}$ denotes the constant term in the Laurent
expansion of $f(z)$. Equations (2-1) also recursively compute the backwards prediction error
filter $B_n(z)$. This is the error filter for estimating $x(i-n-1)$ from its future $n$ values
$\{x(j),\ i-n \le j \le i-1\}$.

The $\{k_i,\ i = 1 \ldots n\}$ characterize the optimal prediction filters of all orders up to $n$:
given $\{k_i,\ i = 1 \ldots n\}$, (2-1a) could be used to compute all of the prediction error filters
$\{A_i(z),\ i = 1 \ldots n\}$, even though the latter have a total of $n(n+1)/2$ coefficients. The
$k_i$ are called reflection coefficients, since equations (2-1) can be implemented on a lattice
filter in which signals in one rail are scattered into the other rail, with gain $k_i$ in the $i$th
section of the lattice [11]. This is illustrated in Fig. 1.

Note that the signal propagation in the lattice filter (Fig. 1) is similar to the wave
propagation in a one-dimensional scattering medium probed with an impulsive wave at
the left end. In this case $k_i$ is the reflection coefficient at the $i$th interface, which reflects
part of the wave travelling in one direction into the wave travelling in the other direction.
The connection between one-dimensional scattering and linear prediction has been noted
in [6] and [12]; as we shall see, this connection generalizes to three dimensions [13].

In the Levinson algorithm the reflection coefficients $k_i$ are computed using (2-1b),
which is called the “inner product” computation. Equations (2-1) require $3n$
multiplications; one-third of these are in (2-1b). Worse, this is a non-parallelizable computational
bottleneck; it would be desirable to avoid this computation if possible. This motivates the
next two algorithms.
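
The recursions (2-1) can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation of [10]: the function name `levinson`, the storage of polynomials in $z$ as coefficient arrays, and the normalization $r_0 = 1$ (so that $P_0 = 1$) are assumptions made here.

```python
import numpy as np

def levinson(r, n):
    """Run recursions (2-1) up to order n.

    r[j] is the covariance at lag j; r[0] == 1 is assumed, so P_0 = 1.
    Polynomials in z are coefficient arrays (index = power of z).
    Returns (k, A, P): k[0..n-1] holds k_1..k_n, A is A_n(z), P is P_n.
    """
    A = np.array([1.0])                  # A_0(z) = 1   (2-1d)
    B = np.array([1.0])                  # B_0(z) = 1   (2-1d)
    P = 1.0                              # assumed normalization P_0 = r[0] = 1
    k = []
    for m in range(n):                   # m = current order; computes k_{m+1}
        # (2-1b): coefficient of z^{m+1} in A_m(z) R(z), the "inner product"
        delta = sum(A[j] * r[m + 1 - j] for j in range(m + 1))
        km = -delta / P
        # (2-1a): A_{m+1} = A_m + z k B_m,  B_{m+1} = k A_m + z B_m
        Ap = np.concatenate((A, [0.0]))
        zB = np.concatenate(([0.0], B))
        A, B = Ap + km * zB, km * Ap + zB
        P = (1.0 - km ** 2) * P          # (2-1c)
        k.append(km)
    return np.array(k), A, P
```

The inner product in the loop is the step that grows with the order and cannot be split across the polynomial coefficients, which is the bottleneck discussed above.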

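2.2 Classical Schur Algorithm. If the recursions (2-1a) are initialized using $R(z)$
instead of (2-1d), the result is the Schur algorithm [14]:

$$\begin{bmatrix} U_n(z) \\ D_n(z) \end{bmatrix} = \begin{bmatrix} 1 & z k_n \\ k_n & z \end{bmatrix} \begin{bmatrix} U_{n-1}(z) \\ D_{n-1}(z) \end{bmatrix} \tag{2-2a}$$

$$k_{n+1} = -\left.\frac{U_n(z)}{z D_n(z)}\right|_{z=0} \tag{2-2b}$$

$$D_0(z) = 1 + R(z); \quad U_0(z) = R(z) \tag{2-2c}$$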

The  Schur  algorithm  can  be  stated  in  several  different  forms;  we  chose  this  form  so 
that  the  recursions  (2-2a)  match  (2-la).  In  comparing  (2-2)  with  [3],  we  have  Uk{z)  = 
T.’jik+i  Dkiz)  =  ^k.k-jz^  for  the  c.j  and  n  of  [3]. 

The  scattering  interpretation  of  the  Schur  algorithm  is  as  follows.  The  Schur  algorithm 
propagates  the  waves  in  the  lattice  structure  of  Figure  1  resulting  from  an  impulsive 
initialization  (the  “1”  in  (2-2c))  at  its  left  end.  Hence  it  computes  the  ki  from  the  reflection 
response  R(z). 

Note that in the Schur algorithm the $k_i$ are computed using (2-2b), which is not
an “inner product” computation (it requires only a single division). Hence the Schur
algorithm can be propagated in parallel with the Levinson algorithm, solely for the purpose
of computing the reflection coefficients $k_i$, and thus avoiding the inner product (2-1b)
required by the Levinson algorithm alone [15].
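
A minimal Python sketch of the Schur recursions (2-2) follows. It assumes $r_0 = 1$, stores power series in $z$ as coefficient arrays truncated to degree $n$ (enough to determine $k_1, \ldots, k_n$), and uses the hypothetical name `schur`; none of these choices come from [14].

```python
import numpy as np

def schur(r, n):
    """Reflection coefficients k_1..k_n from recursions (2-2).

    r[j] is the covariance at lag j, with r[0] == 1 assumed.
    """
    r = np.asarray(r, dtype=float)
    U = np.concatenate(([0.0], r[1:n + 1]))      # U_0(z) = R(z)
    D = U.copy()
    D[0] = 1.0                                   # D_0(z) = 1 + R(z)
    k = []
    for m in range(n):
        # (2-2b): U_m starts at z^{m+1} and D_m at z^m, so the constant
        # term of U_m / (z D_m) is a ratio of two leading coefficients.
        km = -U[m + 1] / D[m]
        zD = np.concatenate(([0.0], D[:-1]))     # z D_m(z), truncated
        U, D = U + km * zD, km * U + zD          # (2-2a)
        k.append(km)
    return np.array(k)
```

Each reflection coefficient costs a single division here; the update (2-2a) acts on every coefficient independently, which is why it parallelizes.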

2.3 Classical Lattice Algorithm. Now let $X(z)$ be the spectral factor of the
two-sided covariance sequence of $x(i)$, i.e.,

$$1 + R(z) + R(1/z) = X(z)X(1/z). \tag{2-3}$$

If the recursions (2-1a) are initialized using $X(z)$ instead of (2-1d), the result is the lattice
algorithm [16]:

$$\begin{bmatrix} F_n(z) \\ G_n(z) \end{bmatrix} = \begin{bmatrix} 1 & z k_n \\ k_n & z \end{bmatrix} \begin{bmatrix} F_{n-1}(z) \\ G_{n-1}(z) \end{bmatrix} \tag{2-4a}$$

$$k_{n+1} = -\left.\frac{F_n(z) G_n(1/z)}{z^{n+1} P_n}\right|_{z=0} \tag{2-4b}$$

$$P_n = (1 - k_n^2) P_{n-1} \tag{2-4c}$$

$$F_0(z) = G_0(z) = X(z) \tag{2-4d}$$

Note that in the lattice algorithm an “inner product” computation (2-4b) is required.
Hence its only advantage over the Levinson algorithm is that, given knowledge of $X(z)$
instead of $R(z)$, it avoids the computation (2-3).

The  scattering  interpretation  of  the  lattice  algorithm  is  as  follows  [17].  The  lattice 
algorithm  propagates  the  waves  in  the  lattice  structure  resulting  from  an  impulsive  initial¬ 
ization  at  its  right  end.  Hence  it  computes  the  ki  from  the  transmission  response  X(z). 
The  reflection  response  R(z)  and  transmission  response  X(z)  are  related  by  (2-3)  [18]. 

2.4 Split Levinson Algorithm. There is some redundancy in the above algorithms.
Defining $h_n(z)$ from (2-1a) as

$$h_n(z) = A_n(z) + zB_n(z) \tag{2-5}$$

it may be shown using (2-1) [3] that $h_n(z)$ satisfies the three-term recurrence

$$h_{n+1}(z) = (z + 1)h_n(z) - z a_n h_{n-1}(z) \tag{2-6a}$$

$$h_0(z) = 1 + z; \quad h_{-1}(z) = 2 \tag{2-6b}$$

and that $a_n$ may be computed using

<■.  =  </<:!;  <  =  (2-7) 

Equations (2-6) and (2-7) constitute the split Levinson algorithm. $h_{-1}(z)$ is defined from
(2-5) to initialize the three-term recurrence. The point is that the two coupled recursions
(2-1a) are replaced by the three-term recurrence (2-6). Since (2-6) only requires $n$
multiplications, while (2-1a) requires $2n$, using (2-6) saves 50% of the multiplications. However,
note that an “inner product” computation (2-7) is still required at each recursion.

The $\{a_i\}$ characterize the optimal filters of all orders just as the $\{k_i\}$ do; indeed we
have [3]

$$a_n = (1 + k_n)(1 - k_{n-1}). \tag{2-8}$$

Also, the quantity $S_n(z) = h_n(z)/w^n$, where $w = z^{1/2}$, satisfies

$$S_{n+1}(z) + S_{n-1}(z) - (w + 1/w)S_n(z) = V_n S_{n-1}(z) \tag{2-9a}$$

$$V_n = 1 - a_n \tag{2-9b}$$

Equation (2-9) has the form of a discrete Schrödinger equation [19]. Since a scattering
interpretation can be assigned to the lattice-based algorithms, a reformulation of these
algorithms in terms of a discrete Schrödinger equation is not surprising.

Note that the scattering potential $V_n$ is simply $1 - a_n$; in the sequel we refer to
$a_n$ as a potential. Thus the split algorithms can be interpreted as propagating the field
quantities (voltage, pressure, etc.) associated with the scattering medium, while the
classical algorithms propagate waves in the scattering medium. For more details see [17].

Since the decomposition of the field quantity into forward and backward travelling
waves is not possible in three dimensions, only the split algorithms can be generalized to
three dimensions. The potential $V_n$ defined in (2-9b) generalizes to three dimensions (see
(4-2) below), but there are additional dependencies in it.
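
The three-term recurrence can be checked numerically without assuming any closed form for the potential. The sketch below builds $h_n(z) = A_n(z) + zB_n(z)$ from the coupled recursion (2-1a) for an arbitrary set of reflection coefficients, then recovers each $a_n$ by matching one coefficient of $h_{n+1}(z) = (z+1)h_n(z) - z a_n h_{n-1}(z)$ and reports the largest residual in the remaining coefficients. The function names and the coefficient-array representation are assumptions made for illustration. For the recursion exactly as coded here, expanding (2-1a) gives $(1+k_n)(1-k_{n+1})$ with $k_0 = 0$; index and sign conventions for reflection coefficients differ between references, so any quoted product formula should be compared against this with care.

```python
import numpy as np

def split_polynomials(k):
    """Return [h_{-1}, h_0, ..., h_n], h_m = A_m + z B_m, from (2-1a).

    k = [k_1, ..., k_n]; polynomials are coefficient arrays in powers of z.
    """
    A = np.array([1.0])
    B = np.array([1.0])
    hs = [np.array([2.0]), np.array([1.0, 1.0])]   # h_{-1} = 2, h_0 = 1 + z
    for km in k:
        Ap = np.concatenate((A, [0.0]))
        zB = np.concatenate(([0.0], B))
        A, B = Ap + km * zB, km * Ap + zB          # coupled recursion (2-1a)
        hs.append(np.concatenate((A, [0.0])) + np.concatenate(([0.0], B)))
    return hs

def recover_potentials(hs):
    """Solve the three-term recurrence for a_m; return (a, V, max residual)."""
    a, worst = [], 0.0
    for i in range(1, len(hs) - 1):                # hs[i] holds h_{i-1}
        h_prev, h_cur, h_next = hs[i - 1], hs[i], hs[i + 1]
        # (z + 1) h_cur - h_next should equal z * a * h_prev
        lhs = np.concatenate((h_cur, [0.0])) + np.concatenate(([0.0], h_cur)) - h_next
        z_h_prev = np.zeros_like(lhs)
        z_h_prev[1:1 + len(h_prev)] = h_prev
        a_i = lhs[1] / h_prev[0]
        worst = max(worst, float(np.max(np.abs(lhs - a_i * z_h_prev))))
        a.append(a_i)
    a = np.array(a)
    return a, 1.0 - a, worst                       # V = 1 - a
```

A residual at rounding level confirms that a single scalar per order carries the same information as the coupled pair of recursions.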

2.5 Split Schur Algorithm. Defining $v_n(z)$ from (2-2a) as

$$v_n(z) = U_n(z) + zD_n(z) \tag{2-10}$$

it may be shown using (2-2) [3] that $v_n(z)$ and $a_n$ can be computed using the split Schur
algorithm:

$$v_{n+1}(z) = (z + 1)v_n(z) - z a_n v_{n-1}(z) \tag{2-11a}$$

$$a_n = \tau_n/\tau_{n-1}; \quad \tau_n = \left.v_n(z)/z^{n+1}\right|_{z=0} \tag{2-11b}$$

$$v_0(z) = z + R(z) + zR(z); \quad v_{-1}(z) = 1 + 2R(z) \tag{2-11c}$$

v-i(z)  is  defined  from  (2-10)  to  initialize  the  three-term  recurrence.  In  comparing  (2-11) 
with  (18)  of  [3],  note  that  Vk(z)  =  52>=ifc+i  for  the  Vij  and  n  of  [3]. 

As with the Levinson algorithm, the split Schur algorithm (2-11) requires only 50%
as many multiplications as the classical Schur algorithm (2-2). Also, note that there is
no “inner product” computation, so that the split Levinson and Schur algorithms can be
propagated together, with the split Schur algorithm replacing the “inner product” (2-7).
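
A sketch of the split Schur recursion in Python is given below. It assumes $r_0 = 1$, truncates all series in $z$ to degree $n$, starts from $v_{-1}(z) = 1 + 2R(z)$ and $v_0(z) = z + R(z) + zR(z)$, and reads each potential off as the ratio of successive leading coefficients of $v_n(z)$ (the coefficient of $z^{n+1}$). The name `split_schur` and the array layout are illustrative assumptions.

```python
import numpy as np

def split_schur(r, n):
    """Potentials a_0..a_{n-1} from the three-term split Schur recursion.

    r[j] is the covariance at lag j, with r[0] == 1 assumed.
    """
    r = np.asarray(r, dtype=float)
    R = np.concatenate(([0.0], r[1:n + 1]))            # R(z), degree <= n

    def shift(p):
        """Multiply by z and truncate to the same length."""
        return np.concatenate(([0.0], p[:-1]))

    v_prev = 2.0 * R
    v_prev[0] = 1.0                                    # v_{-1} = 1 + 2 R(z)
    v = R + shift(R)
    v[1] += 1.0                                        # v_0 = z + R + z R
    tau_prev = v_prev[0]                               # leading coeff of v_{-1}
    a = []
    for m in range(n):
        tau = v[m + 1]                                 # coeff of z^{m+1} in v_m
        a.append(tau / tau_prev)                       # one division, no inner product
        v, v_prev = v + shift(v) - a[-1] * shift(v_prev), v
        tau_prev = tau
    return np.array(a)
```

For the covariance $r_j = 0.5^j$ the first three potentials are 1.5, 0.5 and 1.0, so $V_n = 1 - a_n$ vanishes once the order exceeds that of the underlying first-order process.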

2.6 Split Lattice Algorithm. Defining $u_n(z)$ from (2-4a) as

$$u_n(z) = F_n(z) + zG_n(z) \tag{2-12}$$

it may be shown using (2-4) [3] that $u_n(z)$ and $a_n$ can be computed using the split lattice
algorithm:

$$u_{n+1}(z) = (z + 1)u_n(z) - z a_n u_{n-1}(z) \tag{2-13a}$$

$$a_n = \lambda_n/\lambda_{n-1}; \quad \lambda_n = \left.u_n(z)u_n(1/z)\right|_{z=0} \tag{2-13b}$$

$$u_0(z) = (1 + z)X(z); \quad u_{-1}(z) = 2X(z) \tag{2-13c}$$

$u_{-1}(z)$ is defined from (2-12) to initialize the three-term recurrence. Again (2-13a) requires
only 50% as many multiplications as (2-4a). However, the “inner product” (2-13b) is still
necessary. Given knowledge of $X(z)$, instead of $R(z)$, the split lattice algorithm could be
used to compute the $k_i$ without the computation (2-3).

2.7 Continuous Parameter Forms. The continuous-parameter form of the
three-term recurrence (2-6) (and also (2-11a) and (2-13a)) is determined by noting that (2-6)
is related to a discrete Schrödinger equation (2-9a) by a simple delay. The
continuous-parameter Schrödinger equation in the time domain is

$$\left(\frac{\partial^2}{\partial x^2} - \frac{\partial^2}{\partial y^2}\right) h(x,y) = V(x)h(x,y) \tag{2-14}$$

where $h(x,y)$ is the continuous-parameter version of $h_n(z)$ ($y$ is time). Equation (2-14)
describes a continuous one-dimensional scattering medium with continuous scattering
potential $V(x)$ (the continuous version of (2-9b)). It is also the equation for a vibrating,
elastically-based string used in [4] for one-dimensional linear estimation problems.

In the following sections the three-dimensional version of (2-14) is used for
three-dimensional linear estimation problems. It should be clear why this can be construed
as a three-dimensional, continuous-parameter analogue of the three-term recurrences that
constitute the one-dimensional split algorithms.

3.  Basic  Equations 

3.1 Problem Specification. The basic problem is as follows. Let

$$w(x) = s(x) + v(x), \quad x \in \mathbb{R}^3 \tag{3-1}$$

be some noisy observations of a zero-mean real-valued random field $s(x)$ having covariance

$$E[s(x)s(y)] = k(x,y). \tag{3-2}$$

$v(x)$ is a zero-mean real-valued white noise field with covariance

$$E[v(x)v(y)] = \delta(x - y) \tag{3-3}$$

and $v(x)$ is uncorrelated with $s(x)$.

We wish to compute the linear least-squares estimate $\hat{s}(x)$ of $s(x)$ given the noisy
observations $w(x)$ inside a sphere of radius $T$, i.e., given $\{w(y),\ |y| \le T\}$. To be exact, we
wish to compute the conditional mean $E[s(x)|W]$, where $W$ is the Hilbert space spanned
by $\{w(y),\ |y| \le T\}$. The estimation problem then reduces to computing the optimal filter
$g(x,y;T)$, which in turn yields $\hat{s}(x)$ by

$$\hat{s}(x) = \int_{|y| \le T} g(x,y;T)w(y)\,dy = \int_0^T \!\!\int_S g(x,|y|e;T)\,w(|y|e)\,|y|^2\,de\,d|y|, \quad |x| \le T. \tag{3-4}$$

Here $S$ is the unit sphere and $y = |y|e$, where $e$ is a unit vector; $de$ is the differential area
on the surface of the unit sphere $S$; in standard spherical coordinates $de = \sin\theta\,d\theta\,d\phi$.
By the orthogonality principle, $g(x,y;T)$ solves the three-dimensional Fredholm integral
equation of the second kind

$$k(x,y) = g(x,y;T) + \int_0^T \!\!\int_S g(x,re;T)\,k(re,y)\,r^2\,de\,dr, \quad 0 \le |x|, |y| \le T. \tag{3-5}$$

Most of this paper will be concerned with the intermediate problem of computing the
linear least-squares estimate of $s(x)$ given the noisy observations $\{w(y),\ |y| \le |x|\}$. This is
the filtering problem of estimating $s(x)$ on the surface of the sphere of observations. It can
also be viewed as the three-dimensional analogue of the linear prediction problem solved
by the Levinson algorithm in one dimension. The forward and backward predictors for
either end of the segment of observations generalize to the predictors for all points on the
surface of the sphere of radius $|x|$.

The optimal filter for this problem is $h(x,y)$, for which the Fredholm integral equation
(3-5) becomes the Wiener-Hopf integral equation

$$k(x,y) = h(x,y) + \int_{|z| \le |x|} h(x,z)k(z,y)\,dz, \quad |y| \le |x|. \tag{3-6}$$

Without loss of generality, we define $h(x,y) = 0$ for $|y| > |x|$. $h(x,y)$ can be viewed as
the analogue of a continuous-parameter quarter-plane autoregressive filter, except that the
causality is defined in terms of $|x|$ and $|y|$, so that there is no “corner” and no ambiguity
over in which direction to proceed. In Section 5 we show that $g(x,y;T)$, the ultimate goal,
can be obtained easily from $h(x,y)$.

The function $k(x,y)$ is assumed to be positive definite, and it is assumed to have the
generalized displacement property [13]

$$(\Delta_x - \Delta_y)k(x,y) = 0 \tag{3-7}$$

where $\Delta_x$ is the Laplacian with respect to $x \in \mathbb{R}^3$, and similarly for $\Delta_y$. Equation (3-7)
is a direct generalization of the Toeplitz-plus-Hankel structure exploited by the
one-dimensional Levinson, Schur, and lattice algorithms. The structure (3-7) of the covariance
makes possible fast algorithms for solving the integral equation (3-6).

The structure of $k(x,y)$ implied by (3-7) reduces the number of degrees of freedom
in the function $k(x,y)$ from six to five. This is still far more general than the case of a
homogeneous random field having covariance $k(x - y)$ (three degrees of freedom) treated
in [1], or the case of an isotropic random field having covariance $k(|x - y|)$ (one degree
of freedom) treated in [5]. Note that both homogeneous and isotropic random fields are
included as special cases of the property (3-7). Note also that not all three components
of $x$ and $y$ need refer to spatial variables; a two-dimensional time-varying random field
whose spatial covariance satisfies the two-dimensional version of (3-7), and which is also
stationary in time, would satisfy (3-7).
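
The displacement property can be probed numerically with centered finite differences. The sketch below estimates $(\Delta_x - \Delta_y)k(x,y)$ at a single point pair for three kernels: a homogeneous one, a sum of a function of $x - y$ and a function of $x + y$, and a separable one that violates the property. The kernels are hypothetical test functions chosen only for illustration; they are not taken from the text and are not claimed to be valid covariances.

```python
import numpy as np

def laplacian_difference(k, x, y, h=1e-3):
    """Centered-difference estimate of (Delta_x - Delta_y) k(x, y) in 3-D."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    total = 0.0
    for i in range(3):
        e = np.zeros(3)
        e[i] = h
        d2x = k(x + e, y) - 2.0 * k(x, y) + k(x - e, y)
        d2y = k(x, y + e) - 2.0 * k(x, y) + k(x, y - e)
        total += (d2x - d2y) / h ** 2
    return total

def homogeneous(x, y):
    """Depends on x - y only (Toeplitz-like)."""
    return np.exp(-np.sum((x - y) ** 2))

def sum_and_difference(x, y):
    """Adds a term depending on x + y only (Hankel-like)."""
    return homogeneous(x, y) + 0.5 * np.exp(-np.sum((x + y) ** 2))

def separable(x, y):
    """Different decay rates in x and y; does not satisfy the property."""
    return np.exp(-np.sum(x ** 2) - 2.0 * np.sum(y ** 2))
```

Evaluating `laplacian_difference` at, say, `x = (0.3, -0.2, 0.5)` and `y = (0.1, 0.4, -0.3)` returns a value at rounding level for the first two kernels and a value of order one for the third.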

3.2 The Radon Transform. The Radon transform will be used extensively throughout
this paper. The Radon transform of a function $f(x),\ x \in \mathbb{R}^3$ is defined as

$$\mathcal{R}\{f(\cdot)\}(\tau,e) = f(\tau,e) = \int f(x)\,\delta(\tau - e \cdot x)\,dx \tag{3-8}$$

so that it is the integral of $f(x)$ over the plane $\tau = e \cdot x$. Note that $f(\tau,e) = f(-\tau,-e)$.
The inverse Radon transform is

$$f(x) = \mathcal{R}^{-1}\{f(\cdot,\cdot)\}(x) = -\frac{1}{8\pi^2}\int_S \!\int_{-\infty}^{\infty} \frac{\partial^2 f(\tau,e)}{\partial\tau^2}\,\delta(\tau - e \cdot x)\,d\tau\,de \tag{3-9}$$

A good treatment of the Radon transform is [20].

An important property of the Radon transform is the projection-slice theorem [20]

$$\mathcal{R}\{f(\cdot)\}(\tau,e) = \mathcal{F}^{-1}_{|k| \to \tau}\{\mathcal{F}_{x \to |k|e}\{f(x)\}\} \tag{3-10}$$

Here $\mathcal{F}^{-1}_{|k| \to \tau}$ denotes a one-dimensional inverse Fourier transform taking $|k|$ into $\tau$, with $|k|$
extended to negative values by conjugate symmetry. $\mathcal{F}_{x \to |k|e}$ denotes a three-dimensional
Fourier transform taking $x \in \mathbb{R}^3$ into $|k|e$.

Another important property, which is the motivation for using the Radon transform
in this paper, is [20]

$$\mathcal{R}\{\Delta f(\cdot)\}(\tau,e) = \frac{\partial^2}{\partial\tau^2}\mathcal{R}\{f(\cdot)\}(\tau,e). \tag{3-11}$$
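
As a sanity check of property (3-11), namely that the Radon transform turns the three-dimensional Laplacian into a second derivative in the offset variable, consider the Gaussian $f(x) = e^{-|x|^2/2}$. This test function is chosen here purely for illustration; $\rho$ denotes the two in-plane coordinates on the plane $\tau = e \cdot x$.

```latex
\begin{align*}
\mathcal{R}\{f\}(\tau,e)
  &= \int_{\mathbb{R}^2} e^{-(\tau^2+|\rho|^2)/2}\,d^2\rho
   = 2\pi\,e^{-\tau^2/2}, \\
\Delta f(x)
  &= \left(|x|^2 - 3\right)e^{-|x|^2/2}, \\
\mathcal{R}\{\Delta f\}(\tau,e)
  &= \int_{\mathbb{R}^2} \left(\tau^2 + |\rho|^2 - 3\right)
     e^{-(\tau^2+|\rho|^2)/2}\,d^2\rho
   = \left[2\pi(\tau^2 - 3) + 4\pi\right]e^{-\tau^2/2}
   = 2\pi\left(\tau^2 - 1\right)e^{-\tau^2/2}, \\
\frac{\partial^2}{\partial\tau^2}\,\mathcal{R}\{f\}(\tau,e)
  &= 2\pi\left(\tau^2 - 1\right)e^{-\tau^2/2}.
\end{align*}
```

The last two lines agree, and the result is independent of the direction $e$, as expected for a radially symmetric function.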

Using (3-11), it may be shown that a covariance function satisfying (3-7) will have a
Toeplitz-plus-Hankel structure in the double Radon transform domain. To see this, take
the double Radon transform of (3-7). This gives

$$\left(\frac{\partial^2}{\partial\tau_1^2} - \frac{\partial^2}{\partial\tau_2^2}\right) k(\tau_1,\tau_2,e_1,e_2) = 0 \tag{3-12a}$$

$$k(\tau_1,\tau_2,e_1,e_2) = \mathcal{R}_{x \to (\tau_1,e_1)}\mathcal{R}_{y \to (\tau_2,e_2)}\{k(x,y)\} \tag{3-12b}$$

where $\mathcal{R}_{x \to (\tau_1,e_1)}$ denotes the Radon transform taking $x \in \mathbb{R}^3$ into $(\tau_1,e_1)$. This in turn
implies the existence of functions $k_1(\cdot)$ and $k_2(\cdot)$ such that

$$k(\tau_1,\tau_2,e_1,e_2) = k_1(\tau_1 - \tau_2,e_1,e_2) + k_2(\tau_1 + \tau_2,e_1,e_2) \tag{3-13}$$

i.e., that $k(\tau_1,\tau_2,e_1,e_2)$ has Toeplitz-plus-Hankel structure. This is the structure that
makes possible a fast algorithm solution to (3-6).
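
The step from the one-dimensional wave equation in the two offset variables to the sum of a difference term and a sum term is the classical d'Alembert argument. A short version, using the auxiliary variables $u$ and $v$ (introduced here only for this derivation) and suppressing the dependence on $e_1, e_2$:

```latex
\begin{align*}
u &= \tau_1 - \tau_2, \qquad v = \tau_1 + \tau_2, \\
\frac{\partial}{\partial\tau_1} &= \frac{\partial}{\partial u} + \frac{\partial}{\partial v},
\qquad
\frac{\partial}{\partial\tau_2} = -\frac{\partial}{\partial u} + \frac{\partial}{\partial v}, \\
\frac{\partial^2}{\partial\tau_1^2} - \frac{\partial^2}{\partial\tau_2^2}
  &= \left(\frac{\partial}{\partial u} + \frac{\partial}{\partial v}\right)^{2}
   - \left(\frac{\partial}{\partial v} - \frac{\partial}{\partial u}\right)^{2}
   = 4\,\frac{\partial^2}{\partial u\,\partial v}, \\
\frac{\partial^2 k}{\partial u\,\partial v} = 0
  &\;\Longrightarrow\; k = k_1(u) + k_2(v)
   = k_1(\tau_1 - \tau_2) + k_2(\tau_1 + \tau_2).
\end{align*}
```

The first term depends only on the difference of the offsets (Toeplitz) and the second only on their sum (Hankel).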

4. Three-Dimensional Split Algorithms. In this section fast algorithms for
computing the filter $h(x,y)$ from the covariance $k(x,y)$ are derived. These algorithms
are three-dimensional analogues of the split algorithms discussed in Section 2. The basic
recursion is a three-dimensional generalization of (2-14).

4.1 Differential Form of the Wiener-Hopf Equation
A. The Differential Form in x and y

Applying the operator $(\Delta_x - \Delta_y)$ to the integral equation (3-6) and using the
generalized displacement property (3-7), Green’s theorem, and the uniqueness of solution to (3-6)
when $k(x,y)$ is positive definite and both $k(x,y)$ and $h(x,y)$ are sufficiently smooth yields, after some
algebra (see Appendix A),

$$(\Delta_x - \Delta_y)h(x,y) = \int_S V(x,e)\,h(|x|e,y)\,|x|^2\,de \tag{4-1}$$

where the non-local filter potential $V(x,e)$ is defined as


(4-2)

Note that although the Wiener-Hopf equation (3-6) is only valid for $|y| \le |x|$, the
differential form (4-1) is valid for all $x$ and $y$, since for $|y| > |x|$ we have trivially $0 = 0$. Equation
(4-1) is a direct generalization of (2-14) to three dimensions; the only difference is the
extra dependence in the potential $V(x,e)$. Even this is not surprising; since $k(x,y)$ has
five degrees of freedom, $h(x,y)$ does also, and thus the potential function characterizing
the $h(x,y)$ must also have five degrees of freedom.

B. The Three-Dimensional Split Levinson Algorithm of [1]

In [1] the differential form (4-1) was propagated recursively in increasing $|x|$ and
$|y| \le |x|$, yielding $h(x,y)$. At each recursion, the potential $V(x,e)$ was obtained directly
from the integral equation (3-6) using $k(x,y)$ and the previously computed values of $h(x,y)$,
as follows:

V(x,e)  = --^-^fx!^  (k(x,(xje)  -  f  h{x,z)k{z,\x\e)dz\  (4-3) 

The fast algorithm proposed in [1] for homogeneous random fields is as follows. Equation
(4-1) is discretized into a three-term recurrence in increasing $|x|$ and $|y|$, and propagated
along with (4-3). The recursion pattern for updating $h(x,y)$ in $|x|$ and $|y|$ using the
discretized (4-1) is illustrated in Fig. 2.

Note that (4-3) is necessary to compute $V(x,e)$, since the boundary values $h(x,|x|e)$
and their gradients appearing in (4-2) cannot be computed using (4-1) alone, due to the
support of $h(x,y)$. Examination of the recursion pattern illustrated in Fig. 2 makes this
clear. This is analogous to the one-dimensional Levinson algorithm, in which $k_n$ is the
coefficient of $z^n$ in $A_n(z)$; this coefficient is not computed by (2-1a), so that (2-1b) must
also be used.

The computation involved in (4-3), which is a three-dimensional analogue to the “inner
product” computation (2-7) in the one-dimensional split Levinson algorithm (but much
worse), is excessive. Furthermore, it would be desirable not to have to compute Laplacians
in both $x$ and $y$. The former computation can be avoided using three-dimensional split
Schur or lattice algorithms (see 4.3 and 4.4 below). The transverse part of the Laplacian
in $y$ can be eliminated using the Radon transform, as we now demonstrate.

4.2 Three-Dimensional Split Levinson Algorithm
A. The Differential Form in x and t

Since (4-1) holds for all $x$ and $y$, we can perform a Radon transform of (4-1) taking $y$
into $t$ and $e_i$. Using (3-11), this yields

$$\left(\Delta - \frac{\partial^2}{\partial t^2}\right) h(x,t,e_i) = \int_S V(x,e)\,h(|x|e,t,e_i)\,|x|^2\,de \tag{4-4}$$

Equation (4-4) describes a continuous three-dimensional scattering medium with non-local
scattering potential $V(x,e)$. Aside from the non-local nature of the potential, equation
(4-4) is a direct generalization of (2-14) to three dimensions.

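Next, note that the Laplacian operator $\Delta$ can be written as

$$\Delta = \frac{\partial^2}{\partial |x|^2} + \frac{2}{|x|}\frac{\partial}{\partial |x|} + \Delta_T \tag{4-5}$$

where

$$\Delta_T = \frac{1}{|x|^2 \sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial}{\partial\theta}\right) + \frac{1}{|x|^2 \sin^2\theta}\frac{\partial^2}{\partial\phi^2} \tag{4-6}$$

is the transverse Laplacian operator in spherical coordinates. Equation (4-4) can
now be written as

$$\left(\frac{\partial^2}{\partial |x|^2} - \frac{\partial^2}{\partial t^2}\right) |x|\,h(x,t,e_i) = |x|\,H(x,t,e_i) \tag{4-7}$$

where

$$H(x,t,e_i) = -\Delta_T h(x,t,e_i) + \int_S V(x,e)\,h(|x|e,t,e_i)\,|x|^2\,de \tag{4-8}$$

is an auxiliary quantity. Equations (4-7) and (4-8) can be combined into

$$\left(\frac{\partial^2}{\partial |x|^2} - \frac{\partial^2}{\partial t^2}\right) |x|\,h(x,t,e_i) = |x|\left(\int_S V(x,e)\,h(|x|e,t,e_i)\,|x|^2\,de - \Delta_T h(x,t,e_i)\right) \tag{4-9}$$

Equation (4-9), which is the heart of the three-dimensional split Levinson and lattice
algorithms, should be compared with (2-14).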

B. Three-Dimensional Split Levinson Algorithm

The three-dimensional split Levinson algorithm consists of (4-9), propagated as a
recurrence in discretized $|x|$ and $t$, and the Radon transform of (4-3). The recursion pattern
for updating $h(x,t,e_i)$ using the discretized (4-9) is illustrated in Fig. 3. The discretized
(4-9) has the same form as the discrete Schrödinger equation (2-9a), except for the following
differences:

1. A separate set of recurrences is required for each $e_i$ and each direction $x/|x|$. The recurrences
are independent, and completely parallelizable;

2. The simple multiplication by the potential in (2-9a) and (2-14) becomes an integration
over the unit sphere;

3. $h(x,t,e_i)$ and $H(x,t,e_i)$ are weighted by $|x|$, since the recursion is in the increasing
radius $|x|$ of a sphere;

4. The transverse Laplacian must be computed at each recursion. Since this involves
only values of $h(x,t,e_i)$ on the surface of the sphere of radius $|x|$, this can be done at
each recursion. It should be noted that since differentiation is numerically unstable,
some regularizing procedure will be needed for this computation.

5. The inverse Radon transform of $h(x,t,e_i)$ must be computed at the end of the
procedure.

C. Computation of Boundary Values at $t = \pm|x|$

Since $h(x,y) = 0$ for $|y| > |x|$, we have $h(x,t,e_i) = 0$ for $t > |x|$. This follows since the
plane $t = e_i \cdot y$ passes only through values of $y$ such that $|y| \ge t > |x|$, and $h(x,y) = 0$ for
such values. Since the characteristics of (4-9) are $t = \pm|x|$, the recurrence relation (4-9)
will determine $h(x,t,e_i)$ for all $-|x| < t < |x|$, and all non-zero values of $h(x,t,e_i)$, except
for $t = \pm|x|$, will be computed.

The points on the characteristics $t = \pm|x|$ that are not computed in the course of the
recurrences (4-9) can be found using

$$h(x,\,t = -|x|,\,e_i) = h(x,\,t = |x|,\,-e_i) \tag{4-10a}$$

(aR  +  §i)  (4  -  106) 

where the latter is derived in Appendix B using (4-2). Note from (4-10b) that again $V(x,e)$
is not computed as part of the recursions (4-9); it must be supplied separately, using the
Radon transform of (4-3). Also, using (4-9) and (4-10), it can be seen that knowledge of
$V(x,e)$ suffices to compute $h(x,y)$. Thus the potentials $V(x,e)$ characterize the optimal
filters, just as the reflection coefficients do in the one-dimensional case.

As in the one-dimensional case, we now show how three-dimensional split Schur and
lattice algorithms may be used to avoid the computation (4-3) in the three-dimensional
split Levinson algorithm.

4.3 Three-Dimensional Split Schur Algorithm. The split Schur algorithm must
be propagated in x and y, rather than in x and t. The reason for this is that the Schur
algorithm propagates the convolution of the prediction error filter and the observation
field covariance, which is zero for |y| < |x| by the orthogonality principle. However, the
triangularity property of being zero for |y| < |x| does NOT map to the Radon transform
domain. This is unlike the Levinson algorithm, in which h(x,y) = 0 for |y| > |x| implies
h(x,t,e) = 0 for t > |x|. Since the triangularity property is the essential structure of the
Schur algorithm (in one or three dimensions), we are forced back to the x-y domain.

A.  Differential  Form  in  x  and  y 

In this section we define the residual filter φ(x,y) and the residual χ(x,y), and
show that both satisfy the differential form (4-1). In doing so we make use of propagation
of singularities arguments, in which coefficients of different orders of singularities (delta
functions, doublets, etc.) are equated. This can be viewed as equating coefficients of powers of s
in Laurent expansions of Laplace transforms; similar reasoning is used to derive transport
equations. For more details see [22].

First, we must define the spectral density M(k, e_1, e_2). The structure (3-7) of k(x,y)
implies that its double Fourier transform is zero except for its on-shell values. More
specifically, the covariance of the observation field w(x) has the property that

$$\mathcal{F}_x \mathcal{F}_y \{\delta(x - y) + k(x,y)\} = M(k_1, e_1, e_2)\,\delta(|k_1|^2 - |k_2|^2) \tag{4-11}$$


for some function M(k, e_1, e_2).

As an aside, note that the projection-slice theorem (3-10) implies that

$$k(\tau_1, \tau_2, e_1, e_2) = \mathcal{F}^{-1}_{k_1 \to \tau_1} \mathcal{F}^{-1}_{k_2 \to \tau_2} \{ M(k_1, e_1, e_2)\,\delta(|k_1|^2 - |k_2|^2) \} = k_1(\tau_1 - \tau_2, e_1, e_2) + k_2(\tau_1 + \tau_2, e_1, e_2) \tag{4-12}$$

so that k(τ_1, τ_2, e_1, e_2) has Toeplitz-plus-Hankel structure in the double Radon transform
domain, in agreement with (3-13).

Next, define the residual filter φ(x,y) as

$$\phi(x,y) = \delta(x - y) - h(x,y) \tag{4-13}$$

φ(x,y) converts the observation field w(x) into the residual field w(x) - ŝ(x | w(y), |y| < |x|).
Finally, define the residual χ(x,y) as

$$\chi(x,y) = \int \phi(x,z)\,\{\delta(z - y) + k(z,y)\}\,dz \tag{4-14}$$

χ(x,y) is the convolution of the prediction error filter and the observation field covariance,
just as in the one-dimensional case. Using Parseval’s theorem on (4-14), we have


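$$\chi(x, k_2, e_2) = \mathcal{F}_{y \to k_2, e_2}\{\chi(x,y)\} = \int\!\!\int_S \phi(x, k_3, e_3)\,M^*(k_2, e_2, e_3)\,\delta(|k_2|^2 - |k_3|^2)\,|k_3|^2\,de_3\,dk_3$$

$$= \int_S \phi(x, k_2, e_3)\,M^*(k_2, e_2, e_3)\,|k_2|^2\,de_3 \tag{4-15}$$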


We now show that φ(x,y) and χ(x,y) both satisfy (4-1). Apply the operator Δ_x - Δ_y
to (4-13), and recall that h(x,y) = h(x,y)1(|x| - |y|), where 1(·) is the unit step or
Heaviside function (this expresses the support constraint for h(x,y)). Equating coefficients
of singularities (delta functions) and using (4-1) and (4-2) shows that φ(x,y) satisfies (4-1).
Fourier transforming (4-1) with respect to y, and a linearity argument using (4-15), show
that χ(x,y) satisfies (4-1) as well.

B.  Recursions 

By the orthogonality principle we have χ(x,y) = 0 for |y| < |x|. Then, since φ(x,y)
contains an impulse δ(x - y), χ(x,y) must contain one also, and thus it has the form

$$\chi(x,y) = \delta(x - y) + v(x,y)\,1(|y| - |x|) \tag{4-16}$$

Inserting (4-16) into the differential form (4-1) and equating coefficients of singularities
results in

^  (4  -  17a) 


$$(\Delta_x - \Delta_y)\,v(x,y) = \int_S V(x,e)\,v(|x|e, y)\,|x|^2\,de \tag{4-17b}$$


Equations (4-17) constitute the recursions for the three-dimensional split Schur algorithm.
v(x,y) is propagated in increasing |x| and |y| > |x| using (4-17a), and V(x,e)
reconstructed using (4-17b). The recursion pattern for updating v(x,y) is illustrated in
Fig. 4. Note that V(x,e) is computed directly by the recursions (4-17a); no “inner product”
computation is required. The computed V(x,e) is then inserted in (4-1) or (4-9) to
compute h(x,y) via the three-dimensional split Levinson algorithm, avoiding the “inner
product” (4-3).

C.  Initialization 

The split Schur algorithm is initialized by setting |x| = 0 in (4-14). This results in


v(0,y) = k(0,y)


(a|i|  a\y])  (a|x|  +  a|y|)  ‘'“'S'* 


(4  -  18a) 


(4  -  186) 
Note that the dependence of k(x,y) on e = x/|x| for small |x| is needed in (4-18b). This
ensures the five degrees of freedom in the data necessary to compute V(x,e), which also
has five degrees of freedom.

D.  Interpretation 

The split Schur algorithm propagates χ(x,y), which is zero for |y| < |x| by the
orthogonality principle. This is the stochastic interpretation. However, the scattering
interpretation is more illuminating. The form of (4-16) suggests that χ(x,y) results from
initializing (4-1) with an impulse in x and y at the origin x = 0, which spreads out in
increasing |x| along the characteristic |y| = |x| (note that |y| plays the role of time). The
jump in v(x,y), the non-impulsive part of χ(x,y), at this characteristic yields information
about the scattering potential V(x,e).

All of this is analogous to the one-dimensional Schur algorithm; note that for this
type of scattering experiment, the non-local nature of V(x,e) does not affect the support
of χ(x,y). Note also that since both the excitation and the measurement take place at the
origin, this is a reflection-type inverse scattering problem, as opposed to the transmission-
type problem solved by the lattice algorithm below.

This  algorithm  is  called  a  three-dimensional  Schur  algorithm  for  the  following  reasons: 

1. It solves a reflection-type inverse scattering problem;

2. It propagates the residuals v(x,y), whose triangular structure stems from the
orthogonality principle;

3. It is initialized directly with the random field covariance k(x,y);

4.  It  performs  a  spectral  factorization  (see  (4-28)  below). 

4.4 Three-Dimensional Split Lattice Algorithm. In this section we derive two
forms of the three-dimensional split lattice algorithm. One form is propagated in increasing
|x|, is initialized directly using the spectral factor of the covariance k(x,y), and requires
an “inner product.” The other form is propagated in decreasing |x|, is initialized at large
|x| using the spectral factor, and does not require an “inner product.” This second form,
which has no one-dimensional counterpart, exists because the potential V(x,e) in (4-2) is
non-local.

A.  Spectral  Factorization 

Since δ(x - y) + k(x,y) is the covariance of w(x), and we are now interested in deriving a
lattice algorithm, we consider the spectral factorization of the spectral density M(k, e_1, e_2)
defined in (4-11) into (compare to (2-3))

$$M(k, e_1, e_2) = \int_S F(k, e_1, e_3)\,F(k, e_3, e_2)^*\,de_3 \tag{4-19}$$

where F(k, e_1, e_3) is analytic in k in the lower half-plane. This factorization is a
Riemann-Hilbert problem (see (4.1) and (6.9) of [21]); Section 2 of [21] proves, subject to assumptions
about k(x,y) already made, that this problem has a unique solution.

In practice, the spectral factorization (4-19) would never be performed; unless F(k, e_1, e_2)
is known initially in lieu of k(x,y), there is no point in using the split lattice algorithm.
When only k(x,y) is available, the split Schur algorithm, initialized using k(x,y), is to be preferred.

B.  Recursions 

Let φ(x,k,e_i) = ℱ{φ(x,y)} be the Fourier transform with respect to y of φ(x,y), where
φ(x,y) is the residual filter defined in (4-13).

Define ψ(x,k,e) using

$$\psi(x, k, e_1) = \int_S F(k, e_1, e_2)\,\phi(x, k, e_2)\,de_2 \tag{4-20a}$$

$$\phi(x, k, e_1) = \int_S F^{-1}(k, e_1, e_2)\,\psi(x, k, e_2)\,de_2 \tag{4-20b}$$

where F^{-1}(k, e_1, e_2) is the inverse kernel to F(k, e_1, e_2).

Let φ(x,t,e_i) and ψ(x,t,e_i) be the inverse Fourier transforms of φ(x,k,e_i) and ψ(x,k,e_i).
We showed previously that φ(x,y) satisfies (4-1), hence φ(x,t,e_i) satisfies (4-9). And
(4-20a) and a linearity argument show that ψ(x,t,e_i) also satisfies (4-9).

Since F(k, e_1, e_2) is the canonical spectral factor of δ(x - y) + k(x,y), the form of (4-9)
(specifically its characteristic at t = -|x|) implies that ψ(x,t,e_i) has the form

$$\psi(x, t, e_i) = \delta(t - e_i \cdot x) + u(x, t, e_i)\,1(t + |x|) \tag{4-21}$$
To see this, note that φ(x,t,e_i) has support in t on [-|x|, |x|] (since h(x,t,e) does), and
F(k, e_1, e_2) is causal in the t domain. Examination of the convolution in t implied by
(4-20a) shows that ψ(x,t,e_i) has support in t on [-|x|, ∞), yielding (4-21).

Inserting (4-21) into (4-9) and once again equating coefficients of singularities results
in

=  kl(  /  V(i,e)u(|i|e,(,ei)l*P<ie- (i-22a) 

-'<)  =  ‘  =  -1*1’  '•)  =  (S  “  I)  =  -1*1’ 

Equation (4-22a) has the same form as the three-dimensional split Levinson algorithm
(4-9), and it may be propagated in discretized |x| and t in the same way that (4-9) was.
Equations (4-22) appear in [23] as a proposed fast algorithm for solving inverse scattering
problems with non-local potentials. Compare (4-22b) with (4-2) and (4-10b).

C.  Two  Three-Dimensional  Lattice  Algorithms 

Equations (4-22) constitute the recursions for the three-dimensional split lattice algorithm.
By propagating them in either increasing or decreasing |x|, we get two different
three-dimensional lattice algorithms. The recursion patterns for updating u(x,t,e_i) using
(4-22) are illustrated in Fig. 5.

The first algorithm proceeds by initializing u(x,t,e_i) at the origin x = 0, using (4-23)
below, and propagating (4-22) in increasing |x| and t > -|x|. Fig. 5 shows that
V(x,e) is not computed directly by (4-22); an “inner product” combining (4-20b) with the
Radon transform of (4-3) is needed. Because of this, the first algorithm is computationally
inferior to the second algorithm. However, it is analogous to the one-dimensional split
lattice algorithm.

The second algorithm proceeds by initializing u(x,t,e_i) for large |x|, using (4-26)
below, and propagating (4-22) in decreasing |x| and t > -|x|. The advantage of this form
is that V(x,e) is now computed directly using (4-22) (see Fig. 5); no “inner product” is
needed. This makes it clearly superior for computation.
D. Initialization

The first form of the algorithm is initialized by setting x = 0 in (4-20a) and using
(4-21):

$$u(0, t, e_1) = \mathcal{F}^{-1}_{k \to t}\Big\{ \int_S F(k, e_1, e_2)\,de_2 - 1 \Big\} \tag{4-23}$$

Note that (4-23) can be viewed as transmission scattering data at the origin.

To initialize the second form of the algorithm we use scattering arguments, following
[13] and [24]. Equation (4-21) shows that ψ(x,t,e_i) consists of a probing impulsive plane
wave δ(t - e_i · x) and a resulting scattered field u(x,t,e_i). For |x| > T, we can write this
in the frequency domain as


J  )*1 


(4  -  24) 


where S(k, e_1, e_2) is a scattering operator. For large |x|, ψ(x, -k, -e_i) represents solely
the probing plane wave by time causality. Equations (4-20) and (4-24) combine to give
[13],[24]

$$S(k, e_1, e_2) = \int_S F^{-1}(k, e_1, e_3)\,F(-k, e_3, e_2)\,de_3 \tag{4-25}$$

Inserting (4-25) in (4-24) allows the second algorithm to be initialized for large |x| using

$$\psi(x, k, e_2) = \int_S\!\int_S F^{-1}(k, e_1, e_3)\,F(-k, e_3, e_2)\,e^{-ik e_1 \cdot x}\,de_3\,de_1, \qquad |x| \to \infty. \tag{4-26}$$

E.  Stochastic  Interpretation 

The various quantities appearing in the above derivations all have important stochastic
interpretations. We briefly summarize them here; for more details see [13]. φ(x,y) is
the residual filter that converts the observation field w(x) into the residual field r(x) =
w(x) - ŝ(x | w(y), |y| < |x|). This residual field can be decorrelated on the circle |y| = |x|
to give an innovations field. χ(x,y) is the residual, or the difference between the left and
right sides of the Wiener-Hopf equation (3-6); for |y| < |x|, χ(x,y) = 0 by the orthogonality
principle.
F^{-1}(k, e_1, e_2) is the transfer function of a whitening filter that whitens w(x) to a white
field ν(x), while F(k, e_1, e_2) is the transfer function of a modelling filter that transforms the
white field ν(x) back into w(x). Note that F(k, e_1, e_2) and F^{-1}(k, e_1, e_2) are both causal in
the double Radon transform domain: F(τ, e_1, e_2) = ℱ^{-1}{F(k, e_1, e_2)} is causal in τ.
However, they are NOT triangular in the spatial domain: F(x,y) = ℱ^{-1}{F(k, e_1, e_2)} ≠ 0
for |y| > |x|. Hence the white field ν(x) is not an innovations field; it cannot be obtained
from causal filtering of {w(x)}.

ψ(x,k,e_i) filters ν(x) into r(x), as shown by (4-20a): first F(k, e_1, e_2) dewhitens ν(x)
to w(x), then φ(x,y) rewhitens w(x) to r(x). Also, note that (4-26) is the generalization
of a well-known one-dimensional result [24].

We now show that the three-dimensional split Schur algorithm performs the spectral
factorization (4-19). For large |x|, (4-20b) becomes

$$\phi(x, k, e_2) \simeq \int_S F^{-1}(k, e_1, e_2)\,e^{-ik e_1 \cdot x}\,de_1 \tag{4-27}$$

Inserting (4-27) in (4-15) yields

$$\chi(x, k, e_2) \simeq \int_S F^*(k, e_1, e_2)\,e^{-ik e_1 \cdot x}\,de_1 \tag{4-28}$$

so that χ(x,k,e_2), propagated by the Schur algorithm, converges to the spectral factor of
the observation field. This is a direct generalization of one-dimensional results [14].

These algorithms are called three-dimensional lattice algorithms for the following
reasons:

1. They solve a transmission-type inverse scattering problem;

2. They are initialized directly using the spectral factor F(k, e_1, e_2) of the random field
covariance k(x,y);

3. u(x,t,e_i) has the same support t > -|x| as in the one-dimensional lattice algorithm.


5. Computation of the Smoothing Filter g(x,y;T)


5.1 Computation of g(x,y) from h(x,y). We now specify the third and final
stage of the estimation problem: the determination of the smoothing filter g(x,y;T) from
h(x,y). Recall that g(x,y;T) is the filter for estimating the random field s(x) from the
set of observations {w(y), |y| < T}. Therefore the computation of g(x,y;T) completes the
solution to the original estimation problem. The material of this section is taken from [1],
and generalizes results in [4] and [5].

Recall that g(x,y;T) satisfies the Fredholm integral equation (3-5), while h(x,y)
satisfies the Wiener-Hopf integral equation (3-6). Taking the partial derivative with respect
to T of (3-5), and again using the linearity and uniqueness of solution properties of (3-6) (the
argument is similar to that in Appendix A) results in the differential form

$$\frac{\partial}{\partial T}\,g(x, y; T) = -\int_S g(x, Te; T)\,h(Te, y)\,T^2\,de. \tag{5-1}$$

Equation (5-1) allows g(x,y;T) to be computed recursively from h(x,y). The boundary
value g(x,Te;T), the only missing value when (5-1) is propagated recursively in increasing
T for all 0 < |x|, |y| < T, can be computed separately by setting y = Te in (3-5). This
yields

$$g(x, Te; T) = k(x, Te) - \int_0^T\!\!\int_S g(x, r e_r; T)\,k(r e_r, Te)\,r^2\,de_r\,dr \tag{5-2}$$

which computes g(x,Te;T) from already-computed g(x,z;T), |z| < T and known k(x,y).

5.2 Summary of Entire Procedure. The complete procedure for computing
g(x,y) from k(x,y) or F(k, e_1, e_2) is as follows:

1. If k(x,y) is known, use it in (4-18) to initialize the split Schur algorithm. If the spectral
factor F(k, e_1, e_2) is known, use it in (4-26) to initialize the split lattice algorithm;

2. Propagate the split lattice algorithm in decreasing |x|, computing V(x,e) as the
recursion proceeds. Alternatively, propagate the split Schur algorithm in increasing |x|;

3. Propagate the split Levinson algorithm in increasing |x|, using the potential V(x,e)
computed in step 2;

4. Compute h(x,y) as the inverse Radon transform of h(x,t,e_i). This corresponds to the
prediction filter in the one-dimensional Levinson algorithm, with the prediction order being
the size T of the sphere of observations;

5. Compute g(x,y) from h(x,y) by propagating (5-1) and (5-2).

6. Conclusion. Three-dimensional split Levinson, Schur, and lattice algorithms
for the three-dimensional random field least-squares estimation problem have been
obtained. These algorithms directly solve the three-dimensional Wiener-Hopf integral
equation satisfied by the optimal filter, and make no assumptions about the order in which
the three-dimensional data are to be processed. The algorithms are fast since they exploit
the Toeplitz-plus-Hankel structure of the double Radon transform of the covariance of the
observation field w(x), to reduce the amount of computation necessary to solve the integral
equation.

The one-dimensional split algorithms are three-term recurrences that are equivalent
(within a delay) to a discretization of a one-dimensional Schrödinger equation in the time
domain. The three-dimensional algorithms of Section 4 are equivalent to three-dimensional
Schrödinger equations in the time domain, which is why these algorithms are referred to as
three-dimensional split algorithms. The connections between three-dimensional estimation
and inverse scattering problems have been detailed elsewhere [13]; it is worth noting here
that  the  Wiener-Hopf  integral  equation  (3-6)  and  the  differential  form  (4-1)  both  appeared 
in  an  inverse  scattering  context  in  [24]  and  [25]. 

Some issues that need further research are as follows. The non-local potential V(x,e)
complicates matters enormously, since it has no one-dimensional analogue and introduces
non-causality. It would be very desirable to be able to characterize the set of covariance
functions k(x,y), or spectral factors F(k, e_1, e_2), associated with a local potential V(x,e) =
V(x)δ(x/|x| - e). This would lead to causal algorithms much more like the one-dimensional
algorithms. Elements of this set would have three degrees of freedom, like the set of
covariance functions associated with homogeneous random fields. We note here that this
is a major unsolved problem in inverse scattering theory; an estimation viewpoint may well
be more appropriate for solving this problem. Another issue is the numerical performance
of these algorithms.
Appendix A

Derivation of the Differential Form (4-1)

Applying the operator Δ_x - Δ_y to both sides of (3-6) and using the three-dimensional
displacement property (3-7) results in

$$(\Delta_x - \Delta_y)\,h(x,y) + \Delta_x \int_{|z|<|x|} h(x,z)\,k(z,y)\,dz - \int_{|z|<|x|} h(x,z)\,\Delta_z k(z,y)\,dz = 0 \tag{A1}$$

where (3-7) has been used again in the last term. Simplifying the middle term and using
Green’s identity on the last term gives

$$(\Delta_x - \Delta_y)\,h(x,y) + \int_{|z|<|x|} \{(\Delta_x - \Delta_z)\,h(x,z)\}\,k(z,y)\,dz = \int_S V(x,e)\,k(|x|e, y)\,|x|^2\,de \tag{A2}$$

where V(x,e) is defined by (4-2).

In the integral equation (3-6) let x = |x|e, multiply by |x|^2 V(x,e), and integrate over
S. This gives

$$\int_S V(x,e)\,h(|x|e, y)\,|x|^2\,de + \int_{|z|<|x|}\!\int_S V(x,e)\,h(|x|e, z)\,k(z,y)\,|x|^2\,de\,dz = \int_S V(x,e)\,k(|x|e, y)\,|x|^2\,de. \tag{A3}$$

Comparing (A2) and (A3) shows that these integral equations have the same form, with the
same right side. Since the operator

$$K : a(t) \to b(t) = \int k(t,s)\,a(s)\,ds \tag{A4}$$

defined by the covariance kernel k(x,y) is self-adjoint and non-negative definite, the
operator K + I is invertible. This means that the solution of the integral equation (3-6) exists
and is unique. By linearity, therefore, the solutions of the integral equations (A2) and (A3)
must be identical. Equation (4-1) follows.
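
The invertibility step can be spelled out briefly. The following is a sketch under the additional assumption (not stated explicitly in the text) that K is a bounded operator on a Hilbert space with inner product ⟨·,·⟩:

```latex
\langle (K + I)a,\, a \rangle
  = \langle Ka,\, a \rangle + \|a\|^2
  \;\ge\; \|a\|^2 ,
\qquad\text{so}\qquad
\|(K + I)a\| \;\ge\; \|a\| .
```

Thus K + I is bounded below, hence injective with closed range; since it is also self-adjoint, its range is dense, and K + I has a bounded inverse.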


Appendix  B 

Derivation of Equation (4-10b)

Rewrite  (4-2)  as 

V(x,  !,)«(|x|  -  |s,|)  =  -2  (A  +  ^  ^  !,)«(|x|  -  |y|).  (Bl) 


Using the property of the Radon transform that


(B2) 


a Radon transform of (B1) taking y into t and e_i yields

J  ^(*,y)^(kl  -  |y|)^(<  -  Ci  •  y)dy  =  -2 


where the δ(|x| - |y|) has been used to convert 1/|y| to 1/|x| and then pull it outside of
the Radon transform with respect to y. Setting t = |x| reduces the left side of (B3) to
|x|^2 V(x,e_i), and quickly yields equation (4-10b).
Figure Headings

FIG. 1. Lattice filter implementing (2-1) [26].

FIG. 2. Recursion pattern for updating h(x,y) in the three-dimensional Levinson
algorithm of [1].

FIG. 3. Recursion pattern for updating h(x,t,e_i) in the three-dimensional split Levinson
algorithm.

FIG. 4. Recursion pattern for updating v(x,y) in the three-dimensional split Schur
algorithm.

FIG. 5. Recursion pattern for updating u(x,t,e_i) in the three-dimensional split lattice
algorithm.
ABSTRACT

New fast algorithms for solving arbitrary Toeplitz-plus-Hankel systems of equations
are presented. The algorithms are analogues of the split Levinson and Schur algorithms,
although the more general Toeplitz-plus-Hankel structure requires that the algorithms be
based on a four-term recurrence; relations with previous split algorithms are noted. The
algorithms require roughly half as many multiplications as previous fast algorithms for
Toeplitz-plus-Hankel systems.
I. INTRODUCTION


Toeplitz-plus-Hankel systems of equations have many important applications. The
linear prediction problem for nonstationary random processes with Toeplitz-plus-Hankel
covariance functions is one; the recently-developed two-sided autoregressive spectral
estimation procedure [1] is another. Toeplitz-plus-Hankel systems also appear in linear-phase
prediction filter design [2], the Hildebrand-Prony spectral line estimation procedure [3],
and Padé approximation to the cosine series expansion of an even function [4]. The
continuous-time counterpart (an integral equation with a Toeplitz-plus-Hankel kernel)
appears in atmospheric scattering [5] and rarefied gas dynamics [6].

Fast algorithms for solving Toeplitz-plus-Hankel systems have appeared in [7], in which
the Toeplitz-plus-Hankel system is reformulated as a block-Toeplitz system, and [8], in
which a set of coupled recursions is propagated in increasing predictor order ([9] is a
continuous-time version of [8]). The new algorithms of this paper can be viewed as split
versions  of  those  of  [8],  analogous  to  the  split  Levinson  and  Schur  algorithms  of  [10]  being 
split  versions  of  the  classical  Levinson  and  Schur  algorithms.  Alternately,  they  may  be 
viewed  as  analogues  of  the  split  algorithms  of  [10],  applied  not  to  symmetric  Toeplitz 
systems,  but  to  arbitrary  Toeplitz-plus-Hankel  systems. 

The heart of the new algorithms is a four-term recurrence that generalizes the three-term
recurrences of [10] to Toeplitz-plus-Hankel matrices. This recurrence requires two
multiplications per update, which is half the number required by the algorithms of [7]-[9].
This is analogous to the 50% savings in multiplications for the split algorithms of [10] over
the classical Levinson and Schur algorithms. To save space we refer to the new algorithms
as split algorithms, rather than analogues of split algorithms, in the sequel.

In Section II the basic four-term recurrence for the new split Levinson algorithm is
derived. In Section III the computation of generalized potentials using an “inner product”
expression is shown; this and the four-term recurrence constitute the new split Levinson
algorithm. In Section IV a new split Schur algorithm is derived; this avoids the “inner
product” expression required by the split Levinson algorithm. Section V shows how the
new split algorithms are used to solve arbitrary Toeplitz-plus-Hankel systems. Section
VI discusses how the new algorithms are related to previous split algorithms in special
cases. Section VII concludes by summarizing and noting current research in progress on
multi-dimensional versions of these algorithms.

II.  DERIVATION  OF  THE  FOUR-TERM  RECURRENCE 

A.  The  Basic  Problem 

In  Sections  II-IV  we  consider  the  solution  of  the  Toeplitz-plus-Hankel  system 


where the S_{±i,±i} are defined from the {k_{i,j}} and {h_{i,j}} in (15) below, and the ijth element
of the system matrix has the form

$$k_{i,j} = k_1(i - j) + k_2(i + j) \tag{2}$$

for arbitrary functions k_1(·) and k_2(·). Note in particular that the system matrix need
be neither symmetric nor persymmetric; the only requirement is that all of the central
submatrices be nonsingular.
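
As an illustration of the structure (2) and of the nonsingularity requirement, the following sketch builds a small Toeplitz-plus-Hankel matrix from two functions and checks its central submatrices with exact rational arithmetic. The function names and the sample choices of k_1 and k_2 are hypothetical and are not taken from the paper; the check is applied to the matrix with entries k_{i,j} as defined in (2).

```python
from fractions import Fraction


def tph_matrix(k1, k2, m):
    """(2m+1) x (2m+1) matrix with entries k1(i - j) + k2(i + j), i, j = -m..m."""
    idx = range(-m, m + 1)
    return [[Fraction(k1(i - j) + k2(i + j)) for j in idx] for i in idx]


def det(a):
    """Exact determinant by Gaussian elimination over Fractions."""
    a = [row[:] for row in a]
    n, d = len(a), Fraction(1)
    for c in range(n):
        # Find a row at or below the diagonal with a nonzero pivot.
        p = next((r for r in range(c, n) if a[r][c] != 0), None)
        if p is None:
            return Fraction(0)
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for cc in range(c, n):
                a[r][cc] -= f * a[c][cc]
    return d


def central_submatrices_nonsingular(k1, k2, m):
    """True if every centered (2s+1) x (2s+1) submatrix, s = 0..m, is nonsingular."""
    full = tph_matrix(k1, k2, m)
    for s in range(m + 1):
        lo, hi = m - s, m + s + 1
        sub = [row[lo:hi] for row in full[lo:hi]]
        if det(sub) == 0:
            return False
    return True


# Hypothetical sample functions: a dominant Toeplitz diagonal plus a small Hankel part.
def k1_sample(d):
    return Fraction(3) if d == 0 else Fraction(1, 1 + d * d)


def k2_sample(s):
    return Fraction(1, 4 + s * s)
```

With these sample functions the matrix is strictly diagonally dominant, so every central submatrix is nonsingular; taking k_1 = 0 and k_2 = 1 instead gives an all-ones matrix, whose 3 x 3 central submatrix is singular.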

Updating (1) from i to i + 1 increases the size of the matrix by two; this requires
two updates, and requires that k_{i/2,j/2} be defined at half-integer values (i/2, j/2). If i/2 + j/2
is not an integer, let k_{i/2,j/2} = 0; if i/2 + j/2 is an integer, assign k_{i/2,j/2} such that the
matrix with ijth coordinate k_{i/2,j/2} is Toeplitz-plus-Hankel. If k_{i,j} is specified by the form
(2), this can be done easily by inserting the half-integer values in the functions k_1(·) and
k_2(·) (note that the arguments will always be integers); if only the matrix (1) is given, see
Section V.
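
The half-integer extension just described can be sketched as follows, assuming k_{i,j} is given in the form (2). The helper name `k_half` is hypothetical. When both indices are multiples of 1/2 and their sum is an integer, their difference is an integer as well, so k_1 and k_2 are only ever evaluated at integer arguments.

```python
from fractions import Fraction


def k_half(k1, k2, a, b):
    """Kernel value at indices a, b that are multiples of 1/2.

    Returns 0 when a + b is not an integer; otherwise k1(a - b) + k2(a + b),
    where both arguments are then integers.
    """
    a, b = Fraction(a), Fraction(b)
    if (a + b).denominator != 1:
        return 0
    return k1(a - b) + k2(a + b)
```

For example, with k_1(d) = d and k_2(s) = 10s, the value at (1/2, 1/2) is k_1(0) + k_2(1) = 10, while the value at (1/2, 0) is 0 because the index sum is not an integer.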

Omitting the first and last rows of (1) allows it to be rewritten as

$$0 = k_{i,j} + h_{i,j} + \sum_{n=-(i-1)}^{i-1} h_{i,n}\,k_{n,j}; \qquad 0 = k_{-i,j} + h_{-i,j} + \sum_{n=-(i-1)}^{i-1} h_{-i,n}\,k_{n,j}, \qquad -(i-1) \le j \le i-1. \tag{3}$$

Now define the interpolated system of (3) as

$$0 = k_{i+1/2,\,j+1/2} + h_{i+1/2,\,j+1/2} + \sum_{n=-(i-1/2)}^{i-1/2} h_{i+1/2,\,n}\,k_{n,\,j+1/2}, \qquad -(i-1) \le j \le i-1, \tag{4}$$

and similarly for -i - 1/2. The interpolated systems for various orders are auxiliary systems
of Toeplitz-plus-Hankel systems that are solved along with (3) by the algorithms to follow.
This artifice is necessary in order to obtain split algorithms solving nested systems (see
Section VI).

B. Derivation of Four-Term Recurrence for h_{i,j}

To make the derivation easier to follow, we consider only positive i. Define the discrete
wave operator Δ of a function f_{i,j} as

$$\Delta f_{i,j} = f_{i+1/2,\,j} + f_{i-1/2,\,j} - f_{i,\,j+1/2} - f_{i,\,j-1/2} \tag{5}$$

Δ is the discrete version of the continuous wave operator (∂²/∂x² - ∂²/∂y²). Note that the
Toeplitz-plus-Hankel structure (2) is equivalent to

$$\Delta k_{i,j} = 0 \quad \text{for integer } i + j. \tag{6}$$
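
One direction of this equivalence is easy to check numerically: any kernel of the form (2) is annihilated by the discrete wave operator (5). The sketch below does this with exact rational arithmetic. The function names and the sample k_1, k_2 are illustrative assumptions, and for simplicity k_1 and k_2 are evaluated at every argument, ignoring the convention that sets the kernel to zero at non-integer index sums.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def wave_op(f, i, j):
    """Discrete wave operator (5) applied to a two-index function f at (i, j)."""
    return f(i + HALF, j) + f(i - HALF, j) - f(i, j + HALF) - f(i, j - HALF)


def make_tph(k1, k2):
    """Kernel of the form (2): k(i, j) = k1(i - j) + k2(i + j)."""
    return lambda i, j: k1(i - j) + k2(i + j)


# Arbitrary illustrative choices for k1 and k2 (not from the paper).
k = make_tph(lambda d: d ** 3 - 2 * d, lambda s: 5 * s ** 2 + 1)

# Residuals of (6) on a grid of integer and half-integer index pairs.
residuals = [wave_op(k, Fraction(a, 2), Fraction(b, 2))
             for a in range(-6, 7) for b in range(-6, 7)]
```

Every entry of `residuals` is exactly zero, because the four shifted arguments of k_1 cancel in pairs, and likewise for k_2. A kernel without this structure, such as f(i,j) = i²j, gives a nonzero result.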

Apply the operator Δ to (3) by writing (3) with i replaced with i ± 1/2, and then j
replaced with j ± 1/2, and then adding and subtracting (4) appropriately. Using (6) and
the definition (5) gives

$$0 = \Delta h_{i,j} + \sum_{n=-(i-3/2)}^{i-3/2} \Delta h_{i,n}\,k_{n,j} + h_{i+1/2,\,i-1/2}\,k_{i-1/2,\,j} + h_{i+1/2,\,-(i-1/2)}\,k_{-(i-1/2),\,j}$$

$$- \sum_{n=-(i-3/2)}^{i-3/2} \Big( h_{i,\,n+1/2}\,(k_{n+1/2,\,j+1/2} - k_{n,j}) + h_{i,\,n-1/2}\,(k_{n-1/2,\,j-1/2} - k_{n,j}) \Big)$$

$$- h_{i,\,i-1}\,k_{i-1,\,j-1/2} - h_{i,\,-(i-1)}\,k_{-(i-1),\,j+1/2}, \qquad -(i-3/2) \le j \le i-3/2 \tag{7}$$

The first sum in (7) has the desired form for the argument to follow; the second sum and
the two extra terms following each sum are all corrections to the first sum. Note that (7)
only holds for -(i - 3/2) ≤ j ≤ i - 3/2, since in deriving (7) we used (4) with i replaced
with i - 1/2.

The second sum in (7) can be simplified using (6). Changing the summation variable
from n to n + 1 in the second term shows that

$$- \sum_{n=-(i-3/2)}^{i-3/2} \Big( h_{i,\,n+1/2}\,(k_{n+1/2,\,j+1/2} - k_{n,j}) + h_{i,\,n-1/2}\,(k_{n-1/2,\,j-1/2} - k_{n,j}) \Big)$$

$$= \sum_{n=-(i-1/2)}^{i-3/2} h_{i,\,n+1/2}\,\Delta k_{n+1/2,\,j} + h_{i,\,i-1}\,(k_{i-1,\,j-1/2} - k_{i-1/2,\,j}) + h_{i,\,-(i-1)}\,(k_{-(i-1),\,j+1/2} - k_{-(i-1/2),\,j}). \tag{8}$$

The sum in (8) vanishes by (6). Substituting (8) into (7) and collecting the extra terms
on the left side results in

$$0 = V_i^+\,k_{i-1/2,\,j} + V_i^-\,k_{-(i-1/2),\,j} + \Delta h_{i,j} + \sum_{n=-(i-3/2)}^{i-3/2} \Delta h_{i,n}\,k_{n,j} \tag{9}$$

where we have defined the potentials (see [11] for a discussion of this term)

$$V_i^+ = h_{i+1/2,\,i-1/2} - h_{i,\,i-1}; \qquad V_i^- = h_{i+1/2,\,-(i-1/2)} - h_{i,\,-(i-1)} \tag{10}$$

Equation (9) has the same form as (4), with a different left side. To see this, write (4)
with i + 1/2 replaced with i - 1/2 and -(i - 1/2), multiply by V_i^+ and V_i^-, respectively,
and add. This gives

$$-V_i^+\,k_{i-1/2,\,j} - V_i^-\,k_{-(i-1/2),\,j} = V_i^+\,h_{i-1/2,\,j} + V_i^-\,h_{-(i-1/2),\,j} + \sum_{n=-(i-3/2)}^{i-3/2} \big( V_i^+\,h_{i-1/2,\,n} + V_i^-\,h_{-(i-1/2),\,n} \big)\,k_{n,j} \tag{11}$$

C. Basic Four-Term Recurrence for $h_{i,j}$

Since $k_{i,j}$ is nonsingular by assumption, the solution $V_i^{+} h_{i-1/2,\,j} + V_i^{-} h_{-(i-1/2),\,j}$ to
(11) must be unique. Comparing (9) and (11), this implies that

$$ \Delta h_{i,j} = V_i^{+} h_{i-1/2,\,j} + V_i^{-} h_{-(i-1/2),\,j}, \qquad -(i-3/2) \le j \le i-3/2 \qquad (12) $$

which can be written as

$$ h_{i+1/2,\,j} = h_{i,\,j+1/2} + h_{i,\,j-1/2} + (V_i^{+} - 1)\, h_{i-1/2,\,j} + V_i^{-} h_{-(i-1/2),\,j}, \qquad -(i-3/2) \le j \le i-3/2 \qquad (13) $$

Equation (13) is the four-term recurrence that is the heart of the new algorithms. It
is analogous to the three-term recurrence on which the split algorithms of [10] are based,
although there are some differences (see Section VI). Although we have treated i as positive
throughout this derivation, (13) also holds for negative i and −(|i| − 3/2) ≤ j ≤ |i| − 3/2.
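
As a minimal sketch of how (13) might be applied, the following Python fragment updates the interior values of h from row i to row i + 1/2, for −(i − 3/2) ≤ j ≤ i − 3/2. The dictionary layout (keys are exact `Fraction` index pairs), the function names, and the treatment of the potentials as given inputs are illustrative assumptions, not part of the original description.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def four_term_step(h_right, h_left, h_prev_pos, h_prev_neg, v_plus, v_minus):
    """Apply recurrence (13) for a single j.

    h_right    : h[i, j + 1/2]
    h_left     : h[i, j - 1/2]
    h_prev_pos : h[i - 1/2, j]
    h_prev_neg : h[-(i - 1/2), j]
    Returns h[i + 1/2, j].
    """
    return h_right + h_left + (v_plus - 1) * h_prev_pos + v_minus * h_prev_neg

def update_interior(h, i, v_plus, v_minus):
    """Return {(i + 1/2, j): value} for -(i - 3/2) <= j <= i - 3/2.

    `h` is a dict keyed by (row, column) Fraction pairs (hypothetical layout).
    The edge values j = +/-(i - 1/2) are not produced here; the text obtains
    them separately from the definition of the potentials, (10).
    """
    i = Fraction(i)
    new_values = {}
    j = -(i - 3 * HALF)
    while j <= i - 3 * HALF:
        new_values[(i + HALF, j)] = four_term_step(
            h[(i, j + HALF)], h[(i, j - HALF)],
            h[(i - HALF, j)], h[(-(i - HALF), j)],
            v_plus, v_minus,
        )
        j += 1
    return new_values
```

Each new value costs two multiplications (one per potential) once the four neighbouring values are available.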

III.  NEW  SPLIT  LEVINSON  ALGORITHM 

The four-term recurrence (13) can be propagated in increasing |i| and −(|i| − 3/2) ≤
j ≤ |i| − 3/2. Note that for i an integer/half-integer, j will take half-integer/integer values,
respectively. However, since (13) does not hold for j = ±(i − 1), we must update $h_{i,\pm(i-1)}$
using (10), and similarly for $h_{-i,\pm(i-1)}$. Also, both (10) and (13) require $V_i^{+}$ and $V_i^{-}$ to
be supplied separately, computed from $k_{i,j}$; note that (10) cannot be used to compute $V_i^{+}$
and $V_i^{-}$ since (10) is needed to update $h_{i,\pm(i-1)}$. We now show how $V_i^{+}$ and $V_i^{-}$ can be
computed from previously computed $h_{i,j}$ and $k_{i,j}$.

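A. Computation of $V_i^{+}$ and $V_i^{-}$

Setting j = i − 1 in (3) and (4) gives

$$ k_{i+1/2,\,i-1/2} = -h_{i+1/2,\,i-1/2} - \sum_{n=-(i-1/2)}^{i-1/2} h_{i+1/2,\,n}\, k_{n,\,i-1/2}; \qquad k_{i,\,i-1} = -h_{i,\,i-1} - \sum_{n=-(i-1)}^{i-1} h_{i,n}\, k_{n,\,i-1} \qquad (14) $$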

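The second equation requires only $k_{i,j}$ (known) and $h_{i,j}$ (from the previous recursion);
however, the first requires $h_{i+1/2,\,n}$, which has not yet been computed. Substituting (13)
into the first equation and a considerable amount of algebra results in the following. Define
the Schur variables

$$ S_{i,j} = \delta_{i,j} + k_{i,j} + \sum_{n=-(i-1)}^{i-1} h_{i,n}\, k_{n,j}, \qquad j = \pm i \qquad (15) $$

where $\delta_{i,j} = 1$ if i = j and is zero otherwise. Note that $S_{i,j}$ can be computed from known
$k_{i,j}$ and $h_{i,j}$. Then it may be shown that

$$
\begin{bmatrix} S_{i-1/2,\,i-1/2} & S_{-(i-1/2),\,i-1/2} \\ S_{i-1/2,\,-(i-1/2)} & S_{-(i-1/2),\,-(i-1/2)} \end{bmatrix}
\begin{bmatrix} V_i^{+} \\ V_i^{-} \end{bmatrix}
=
\begin{bmatrix} S_{i-1/2,\,i-1/2} - S_{i,i} \\ S_{i-1/2,\,-(i-1/2)} - S_{i,-i} \end{bmatrix}
\qquad (16)
$$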
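
The inner product in (15) can be sketched as follows for integer i. This is only an illustration: the callables `k(i, j)` and `h(i, n)` and the function name are assumptions made for the example, and the half-integer rows are omitted for brevity.

```python
def schur_variable(i, j, k, h):
    """Evaluate S[i, j] = delta_ij + k[i, j] + sum_{n=-(i-1)}^{i-1} h[i, n] * k[n, j].

    `k` and `h` are callables returning matrix and filter entries
    (hypothetical interface); i is a positive integer and j = +/- i in (15).
    """
    delta = 1 if i == j else 0
    # The sum runs over n = -(i-1), ..., i-1, so its length grows linearly with i.
    inner = sum(h(i, n) * k(n, j) for n in range(-(i - 1), i))
    return delta + k(i, j) + inner
```

Because the length of the sum grows with i and the sum has to be accumulated, this is the step that the Schur-type recursion of Section IV is designed to bypass.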


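The existence of a unique solution to (16) is proved in Section V below. The closed-form
solution of (16) is

$$ V_i^{+} = \Big( S_{-(i-1/2),\,-(i-1/2)} \big( S_{i-1/2,\,i-1/2} - S_{i,i} \big) - S_{-(i-1/2),\,i-1/2} \big( S_{i-1/2,\,-(i-1/2)} - S_{i,-i} \big) \Big) / \mathrm{DET} \qquad (17a) $$

$$ V_i^{-} = \Big( S_{i-1/2,\,i-1/2} \big( S_{i-1/2,\,-(i-1/2)} - S_{i,-i} \big) - S_{i-1/2,\,-(i-1/2)} \big( S_{i-1/2,\,i-1/2} - S_{i,i} \big) \Big) / \mathrm{DET} \qquad (17b) $$

$$ \mathrm{DET} = S_{i-1/2,\,i-1/2}\, S_{-(i-1/2),\,-(i-1/2)} - S_{i-1/2,\,-(i-1/2)}\, S_{-(i-1/2),\,i-1/2}. \qquad (17c) $$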
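
The closed form (17) is Cramer's rule applied to the 2×2 system (16). A minimal sketch, assuming the six Schur variables are already available as numbers (the argument names are illustrative, not from the original):

```python
def solve_potentials(s_pp, s_mp, s_pm, s_mm, s_ii, s_i_neg_i):
    """Solve the 2x2 system (16) for (V_i^+, V_i^-) by Cramer's rule, as in (17).

    s_pp = S[i-1/2,  i-1/2]      s_mp = S[-(i-1/2),  i-1/2]
    s_pm = S[i-1/2, -(i-1/2)]    s_mm = S[-(i-1/2), -(i-1/2)]
    s_ii = S[i, i]               s_i_neg_i = S[i, -i]
    """
    det = s_pp * s_mm - s_pm * s_mp                 # (17c)
    if det == 0:
        raise ZeroDivisionError("2x2 system (16) is singular")
    rhs_top = s_pp - s_ii                           # first entry of right side of (16)
    rhs_bottom = s_pm - s_i_neg_i                   # second entry of right side of (16)
    v_plus = (s_mm * rhs_top - s_mp * rhs_bottom) / det      # (17a)
    v_minus = (s_pp * rhs_bottom - s_pm * rhs_top) / det     # (17b)
    return v_plus, v_minus
```

Passing `fractions.Fraction` values keeps the arithmetic exact, which can be convenient when checking a hand calculation.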


B.  New  Split  Levinson  Algorithm 


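Initialization: $h_{\pm 1,0} = -k_{\pm 1,0}/(1 + k_{0,0})$ (18)

Computation of $V_i^{+}$, $V_i^{-}$: Compute $S_{i,\pm i}$ from $k_{i,j}$ (known) and $h_{i,j}$ (from previous
recursion) using (15). Compute $V_i^{+}$ and $V_i^{-}$ from $S_{i,\pm i}$ and $S_{i-1/2,\,\pm(i-1/2)}$ using (17).

Update $h_{i,j}$: Compute $h_{\pm(i+1/2),\,\pm(i-1/2)}$ using (from (10))

$$ h_{i+1/2,\,i-1/2} = h_{i,\,i-1} + V_i^{+}; \qquad h_{i+1/2,\,-(i-1/2)} = h_{i,\,-(i-1)} + V_i^{-} \qquad (19a) $$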

^-(i+l/2),i-l/2  =  -I-  Fij  /l_(<+i/2), -(1-1/2)  =  ^-i,-(i-l)  +  ^-i-  (196) 

Compute $h_{i+1/2,\,j}$, −(i − 3/2) ≤ j ≤ i − 3/2, using (13). Compute $h_{-(i+1/2),\,j}$ by writing
(12) as

h-{i^\l2),j  =  ^-i,>-H/2  +  ^-i,i-l/2  +  (l^i|  -  l)^-(i-l/2),>  +  (20) 

At this point the recursion is complete. The computed $h_{i,j}$ for integer/half-integer
i and j solve the original system (3)/interpolated system (4), respectively; note that two
recursions are needed to increase the size of the system (3) by two (i.e., update i to i + 1).


The heart of the algorithm is the four-term recurrence (13), which requires 2i − 3
multiplications to update $h_{i,j}$. The fast algorithms of [7]-[9] require roughly 4i multiplications
to update $h_{i,j}$. There is a redundancy in the computations of [7]-[9] similar to that in the
classical Levinson and Schur algorithms; the savings of roughly 50% is analogous to the
savings in the split Levinson and Schur algorithms of [10] over the classical algorithms.
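
The quoted saving follows directly from the two counts. A small sketch of the arithmetic, using only the figures stated above (2i − 3 versus roughly 4i multiplications per update); the function name is illustrative, and other costs such as (15) and (17) are not counted:

```python
def recurrence_multiplications(i):
    """Compare the per-update multiplication counts quoted in the text.

    Returns (split_count, previous_count, fractional_saving), where
    split_count = 2i - 3 and previous_count = 4i (the latter is only
    approximate in the text).
    """
    split_count = 2 * i - 3
    previous_count = 4 * i
    saving = 1 - split_count / previous_count
    return split_count, previous_count, saving
```

For large i the ratio (2i − 3)/(4i) approaches 1/2, which is the "roughly 50%" figure.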

This algorithm differs from the split Levinson algorithm of [10] in two other respects.
First, the non-symmetric Toeplitz-plus-Hankel system matrix requires four sequences $V_i^{\pm}$
and $V_{-i}^{\pm}$ of potentials and the four-term recurrence (13). The symmetric Toeplitz system
matrix solved by the split Levinson algorithm of [10] requires only one sequence of potentials
and a three-term recurrence. Second, the split Levinson algorithm of [10] propagates
not $h_{i,j}$ but $h_{i,j} + h_{i,-j}$; this is more efficient for symmetric Toeplitz matrices, but requires
recovery of $h_{i,j}$ from $h_{i,j} + h_{i,-j}$ at termination.

IV.  NEW  SPLIT  SCHUR  ALGORITHM 

The "inner product" (15) computation requires i multiplications; since it is not
parallelizable, it is a computational bottleneck, just as in the classical Levinson algorithm. For
this reason, we now derive a new split Schur-type algorithm for arbitrary Toeplitz-plus-Hankel
matrices. This algorithm can be propagated in parallel with the split Levinson
algorithm derived above; this avoids the computational bottleneck (15). The same idea
was used for the classical Schur and Levinson algorithms in [12].

The first step is to show that the forward prediction error filter satisfies the four-term
recurrence (13). From this, we show that the $S_{i,j}$ defined in (15) (now for all j > i) also
satisfy (13). This implies that (13), initialized using $k_{i,j}$, can be used to compute $V_i^{+}$ and
$V_i^{-}$ quickly.

A. Four-Term Recurrence for $S_{i,j}$

Define $\phi_{i,j}$ by

$$ \phi_{i,j} = \delta_{i,j} + h_{i,j} \qquad (21) $$

Clearly $\phi_{i,j}$ satisfies (13) for −(i − 3/2) ≤ j ≤ i − 3/2 since $\phi_{i,j} = h_{i,j}$ for these values.
At j = ±(i − 1/2) or ±(i + 1/2) $\phi_{i,j}$ satisfies (13), since this reduces to (10). And for
|j| ≥ i + 3/2 (13) reduces to 0 = 0. Hence (13) with $h_{i,j}$ replaced with $\phi_{i,j}$ is true for all i
an integer/half-integer and j a half-integer/integer:

$$ \phi_{i+1/2,\,j} = \phi_{i,\,j+1/2} + \phi_{i,\,j-1/2} + (V_i^{+} - 1)\, \phi_{i-1/2,\,j} + V_i^{-} \phi_{-(i-1/2),\,j} \qquad (22) $$

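Next, extend the definition of $S_{i,j}$ in (15) to all integers and half-integers i and j such
that i + j is an integer:

$$ S_{i,j} = \phi_{i,j} + k_{i,j} + \sum_{n=-(i-1)}^{i-1} h_{i,n}\, k_{n,j}; \qquad S_{i+1/2,\,j} = \phi_{i+1/2,\,j} + k_{i+1/2,\,j} + \sum_{n=-(i-1/2)}^{i-1/2} h_{i+1/2,\,n}\, k_{n,j} \qquad (23) $$

From (3) and (4) $S_{i,j} = 0$ for −(i − 1) ≤ j ≤ i − 1. Substituting (2) and (21) in (23) gives

$$ S_{i,j} = \sum_{n} \phi_{i,n} \big( \delta_{n,j} + k_1(j-n) + k_2(j+n) \big) = \phi_{i,j} + \phi_{i,j} * k_1(j) + \phi_{i,-j} * k_2(j) \qquad (24) $$

where * denotes a convolution in j.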

Since (22) is linear in functions of j, it may be convolved with $k_1(j)$. Note that (22)
still holds if j is replaced with −j, and convolve this with $k_2(j)$. Adding (22) to the
convolution of (22) with $k_1(j)$ and the convolution of the time-reversal of (22) with $k_2(j)$
and using (24) shows that

$$ S_{i+1/2,\,j} = S_{i,\,j+1/2} + S_{i,\,j-1/2} + (V_i^{+} - 1)\, S_{i-1/2,\,j} + V_i^{-} S_{-(i-1/2),\,j} \qquad (25) $$

so that $S_{i,j}$ also satisfies the four-term recurrence (13). Equation (25) can also be derived
by taking the z-transform in j of (22), noting that the result is unaffected if z is replaced
with 1/z, multiplying by the z-transforms of $k_1(j)$ and $k_2(j)$, and adding.

B. New Split Schur Algorithm

Initialization: $S_{0,j} = k_{0,j}$; $S_{\pm 1/2,\,j+1/2} = k_{\pm 1/2,\,j+1/2}$ (26)

Note that $k_{0,m}$ and $k_{\pm 1/2,\,n+1/2}$ for integer m and half-integer n + 1/2 uniquely determine $k_{i,j}$
for all i, j with i + j an integer, using (6).

Computation of $V_i^{+}$, $V_i^{-}$: Compute $V_i^{+}$ and $V_i^{-}$ from $S_{i,\pm i}$ and $S_{i-1/2,\,\pm(i-1/2)}$ using
(17). Similar equations are used to compute $V_{-i}^{+}$ and $V_{-i}^{-}$.

Update $S_{i,j}$, |j| > i, using (25).

At this point the recursion is complete. The split Schur algorithm can be run in
parallel with the split Levinson algorithm, supplying the potentials $V_i^{\pm}$ and $V_{-i}^{\pm}$ while
bypassing the "inner product" computation (15) ((17) is still necessary), as suggested in
[12] for the classical algorithms.

If  the  original  system  (3)  is  a  discretization  of  an  integral  equation,  then  5,j  <<  1 
and  the  Sij  in  (15)  dominates  the  other  terms  if  i  =  j.  In  this  case  the  solution  to  (16)  is 
simply 


^  Si^-i 


(27) 


which  replaces  the  more  complicated  (17). 

V.  SOLUTION  OF  ARBITRARY  TOEPLITZ-PLUS-HANKEL  SYSTEMS 

The  split  algorithms  above  solve  the  systems  (3)  and  (4);  hence  they  also  solve  (1) 
with  S±i,±i  defined  as  in  (15).  We  now  consider  the  general  problem 


(28) 


where  the  right  side  is  now  arbitrary. 

Define $\{c_j, -i \le j \le i\}$ recursively as follows. Let $c_{\pm j}$ be the solution to the 2×2
system 

re  .  .  c,  ,1  r..  .1  r6_.  - 

(29) 


1  + 

■  ■  ■  ^i,—i 

'b-i' 

1  +  ki,i . 

. 

.  . 

1 

C-/ 

^-}  ~  I3n=-(;-l)  •^nSn,-j 

J 

.  . 

^}  ~  Yyn=-[j-\)^nSn,i 

Then  the  solution  to  (28)  is  given  by 

i 


(30) 


where $\phi_{i,j}$ is defined in (21) and $h_{i,j}$ is defined to be zero for |i| < |j|. These equations
may be derived easily by taking linear combinations (weighted by the $c_{\pm j}$) of the columns
of (1) for increasing i and equating to (28). Note how this relies on the split algorithms
solving nested systems of equations as i increases.

We note here that the 2×2 systems (16) and (29) have unique solutions if the central
submatrices of the system matrix (1) are nonsingular. To see this, suppose that the 2×2
system matrix in (16) and (29) is singular. Then the second column is a multiple (say
m) of the first column, and the column vector $[1, \cdots, (h_{-i,j} - m h_{i,j}), \cdots, -m]^T$ solves the
homogeneous system associated with (1), which is impossible as long as the system matrix
in (1) is nonsingular.

If the system matrix is specified by functions $k_1(\cdot)$ and $k_2(\cdot)$ as in (2) (and [7]), then
the initialization (26) for the split Schur algorithm is accomplished using (2) directly (note
the arguments are always integers). However, if the matrix (1) is given directly, then
$k_{\pm 1/2,\,j+1/2}$ must be interpolated from the given values $k_{0,j}$ and $k_{\pm 1,j}$. From (6), these can
be recursively computed as needed in the split Schur algorithm using

$$ k_{\pm 1/2,\,j+1/2} = k_{0,j} + k_{\pm 1,j} - k_{\pm 1/2,\,j-1/2}; \qquad k_{\pm 1/2,\,1/2} = k_{\pm 1/2,\,-1/2} = (1 + k_{0,0} + k_{\pm 1,0})/2 \qquad (31) $$
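
The recursive part of (31) can be sketched as below. This is an illustration under stated assumptions: the callables `k0(j)` and `k1_row(j)` stand for the given rows $k_{0,j}$ and $k_{\pm 1,j}$, the function name is hypothetical, and the starting value $k_{\pm 1/2,\,1/2}$ is passed in as `seed` so that the sketch does not depend on the seeding convention.

```python
def interpolate_half_row(k0, k1_row, seed, j_max):
    """Run the recursion k[1/2, j+1/2] = k[0, j] + k[1, j] - k[1/2, j-1/2].

    k0(j)     -> k[0, j]            (given row, hypothetical callable)
    k1_row(j) -> k[+1, j] or k[-1, j]
    seed      -> k[+/-1/2, 1/2]     (starting value, supplied by the caller)
    Returns a dict mapping j to k[+/-1/2, j + 1/2] for j = 0, ..., j_max.
    """
    out = {0: seed}
    for j in range(1, j_max + 1):
        # out[j - 1] holds k[+/-1/2, j - 1/2], the previously interpolated value.
        out[j] = k0(j) + k1_row(j) - out[j - 1]
    return out
```

When the matrix really has the form $k_1(i-j) + k_2(i+j)$ and the seed is exact, the recursion reproduces $k_1(-j) + k_2(j+1)$, since the other two terms on the right cancel in pairs.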

VI.  RELATION  WITH  PREVIOUS  SPLIT  ALGORITHMS 


A.  Relation  to  the  Split  Algorithms  of  [10] 


To show how the new algorithms reduce to the split algorithms of [10], we first consider
the class of Toeplitz-plus-Hankel matrices such that $k_{i,j} = k_{-i,-j}$. In terms of (2) both
$k_1(\cdot)$ and $k_2(\cdot)$ are even functions; note that covariance functions of time-reversible random
processes have this property. The set of centrosymmetric matrices (matrices that are both
persymmetric $k_{i,j} = k_{-j,-i}$ and symmetric $k_{i,j} = k_{j,i}$) is a subset of this class. From (3)
/i,,j  =  h-i  -j,  from  (15)  Sij  =  and  from  (17)  and  =  Vij.  Hence  the 

computations for i < 0 can all be dispensed with.

We  can  go  further.  Defining 


=  ^«.>  +  ^ 


(32) 


replacing  j  with  —j  in  (12)  and  (25),  and  adding  to  (12)  and  (25)  respectively  results  in 


^^i,j  —  ^®«,>  —  — !,>• 


(33) 


Adding (16a) and (16b) allows Vi to be computed from Cij by


Vi  —  (Ci— 1,«— 1  —  Ci,«)/Ci— 1,»— 1* 


(34) 


From  (3)  and  (32)  Cij  is  the  solution  to 

^•li  4*  —  ^i,j  d"  ^  (35) 

n=-(i-l) 

The solution to (35) can be recursively computed using the three-term recurrences (33),
along with (34). These equations have virtually the same form as the split algorithms of
[10], even though $k_{i,j}$ is not Toeplitz.

To see what is happening here, use (2) to rewrite the left side of (35) as

$$ k_{i,j} + k_{i,-j} = k_1(i-j) + k_2(i+j) + k_1(i+j) + k_2(i-j) = k(i-j) + k(i+j) \qquad (36) $$

where $k(i) = k_1(i) + k_2(i)$. From (32) $a_{i,j} = a_{i,-j}$ and the right side of (35) can be
rewritten using this and (36), yielding

•-I 

kii  -j)  +  k{i  +  j)  =  ai,j  d-  ^  a<,„(fc(n  -j)  +  k{n  +  j)).  (37) 

n=0 

Equation (37) is in fact the symmetric Toeplitz system solved by the split algorithms of [10],
after shifting from a one-sided to a two-sided interval. This shows how these algorithms
are related to the algorithms of this paper. Note that the split algorithms of [10] propagate
$a_{i,j}$, not $h_{i,j}$; $h_{i,j}$ must be computed from $a_{i,j}$ at the end.

If  the  system  matrix  (1)  is  merely  symmetric,  a  more  subtle  simplification  is  possible. 
In  this  case,  the  block- Toeplitz  reformulation  of  [7]  becomes  a  centrosymmetric  block- 
Toeplitz  problem,  and  the  results  of  Section  VI  of  [13]  can  be  used  to  derive  a  three-term 
matrix  recurrence  similar  in  form  to  (13)  and  (20)  combined,  except  that  V?,  = 
However, this recurrence does not propagate the $h_{\pm i,j}$ directly, but weighted combinations
of them, and it requires as many multiplications as the algorithm of this paper (which
also works for nonsymmetric matrices). It is more efficient in that it requires only three
functions, instead of four, to characterize the inverse of the system matrix (1); this is
reasonable since symmetry requires $k_1(\cdot)$ in (2) to be an even function, removing a degree
of freedom.

B.  An  Alternative  Algorithm  for  Non-Nested  Matrices 

If  the  problem  (1)  is  modified  so  that  the  system  matrices  of  different  orders  are  no 
longer  nested,  the  algorithm  takes  a  slightly  different  form.  Consider  the  system 

(38) 

where  kij  is  now  defined  by 

$$ k_{i,j} = k_1(i-j) + k_2(i+j-(n+3)) \qquad (39) $$

and  5"  and  T”  are  defined  as  (compare  to  (15)) 

n  n 

Si  =  fcj+i,j+r,  T”  =  ^y”fc„+2_jf,<+i;  Xq  =yo  =  1-  (40) 

j=0  ;=0 

Defining  the  polynomials 

n  n  oo  oo 

^n(^)  =  F„(r)  =  ^yjr^;  R{z)  = '^ki{j)z^ ;  H{z)  =  '^k2{j)z^  (41) 

J=:0  J  =  0  — OO  — OO 

the  system  (38)  can  be  written  in  polynomial  form  as 

fi(l/z)j:„(r)  +  z"+>ir(j)X„(l/r)  =  . . .  +  SJ  +  5’:+,z"+‘  +  . . .  (42<.) 

R(l/z)i"+'y„(l/z)  +  H(z)Y„{z)  =  . . .  +  ly  +  T„"+,z“+'  +  . . .  (421.) 

where the ellipses indicate terms of lower and higher order in Laurent series.

Knowing the form of the four-term recurrence, writing (42) for n, n + 1, and n − 1,
and adding and subtracting appropriately gives

fl(l/z)  {A-„+,(z)  -  (z  +  l)X„(z)  -  zV„‘A-„-,(^)  -  zV-„^"y„-,(l/z)) 
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^H{Z)  -  Z^-^\1/Z  +  l)Xn(l/z)  -  z"+2(l/^)V„lX„_i(l/^)) 


=  ...+5o"+‘  +  5;:+2'z"+2  +  ...  (43) 

provided  that  and  are  chosen  such  that  (compare  to  (16)) 


As long as the system matrix (38) is invertible, the expression in parentheses in (43) must
be zero. Equations (43) and (44) define a four-term recurrence for the solutions to (38).
Proceeding as before, analogues of split Levinson and Schur algorithms may be derived.

This algorithm avoids the interpolated system and half-integer recursions of the previous
algorithms. However, it does not save any computation. More importantly, (38) and
(39) do not define a nested set of system matrices in increasing order n: the ijth element
of the system matrix changes with order n (see (39)). Hence this algorithm is not useful
for updating problems, in which the size of a Toeplitz-plus-Hankel system is enlarged by
augmenting the system matrix around its edges; this type of problem is common in linear
prediction. The solution of an arbitrary Toeplitz-plus-Hankel system also becomes more
complex than (29)-(30).

A nested system of equations in increasing n can be defined from (38) and (39) by
making the substitution i' = i − (n + 3)/2, j' = j − (n + 3)/2. This alters (38) and (39)
to (1) and (2), respectively; however, for n even it requires that the interpolated system
(4) be defined. This leads back to the previous algorithm.

Although the derivation (43) is simpler than that of Section II, it requires prior knowledge
of the form of the recurrence. The derivation of Section II derives the form of the
recurrence directly, and shows that the Toeplitz-plus-Hankel form, rather than the purely
Toeplitz form, is fundamental to the split-like recurrences. It also shows that matrices
with structure defined implicitly (as in (6)), rather than explicitly (as in (39)), can have
fast algorithms easily derived for them. In particular, this has led to fast algorithms for
block Toeplitz-plus-Hankel systems of equations [14].

VII. CONCLUSION

New fast algorithms have been derived for solving arbitrary Toeplitz-plus-Hankel systems
of equations. The new algorithms can be viewed as analogues of the split Levinson
and Schur algorithms of [10], but applicable to a more general problem. The split Levinson
algorithm recursively computes the solution using a four-term recurrence, but requires a
non-parallelizable computation (15) to compute the potentials. The split Schur algorithm
computes the potentials using a similar four-term recurrence; using it in parallel with the
split Levinson algorithm obviates (15) and allows the same processor architecture to be
used for both algorithms.

The algorithms presented in this paper have two-dimensional analogues applicable
to the linear prediction problem for a two-dimensional random field [14], [15]. Unresolved
issues include the numerical stability of these algorithms, optimal processor architectures
for implementation, and generalization to matrices with singular submatrices.

Abstract

New  generalized  split  Levinson  and  Schur  algorithms  for  the  two-dimensional  linear  least-squares 
prediction  problem  on  a  polar  raster  are  derived.  The  algorithms  compute  the  prediction  filter  for 
estimating  a  random  field  at  the  edge  of  a  disk,  from  noisy  observations  inside  the  disk.  The  covariance 
function  of  the  random  field  is  assumed  to  have  a  Toeplitz-plus-Hankel  structure  for  both  its  radial 
part  and  its  transverse  (angular)  part.  This  assumption  is  valid  for  some  types  of  random  fields,  such 
as isotropic random fields. The algorithms generalize the split Levinson and Schur algorithms in two
ways:  (1)  to  two  dimensions;  and  (2)  to  Toeplitz-plus-Hankel  covariances. 


I  INTRODUCTION 


The  problem  of  computing  linear  least-squares  estimates  of  two-dimensional  random  fields  from  noisy 
observations  has  many  applications  in  image  processing.  In  particular,  the  two-dimensional  discrete 
linear  prediction  problem  is  a  useful  formulation  of  problems  in  image  smoothing  and  coding  [1].  If  the 
random  field:  (1)  is  defined  on  a  rectangular  lattice  of  points;  (2)  is  stationary;  and  (3)  has  quarter-plane 
or asymmetric half-plane causality, then the two-dimensional linear prediction problem may be solved
using  the  multichannel  Levinson  algorithm  [2,  3]  (modifications  of  these  conditions  are  also  possible). 
By  exploiting  the  Toeplitz-block-Toeplitz  structure  of  the  covariance  function  of  the  stationary  random 
field,  this  algorithm  allows  the  linear  prediction  filters  to  be  computed  recursively  using  significantly 
fewer  computations  than  direct  solution  of  the  two-dimensional  discrete  Wiener-Hopf  equations.  The 
multichannel  Schur  algorithm  computes  the  reflection  coefficient  matrices  from  the  covariance  function; 
propagating  it  in  parallel  with  the  Levinson  algorithm  saves  even  more  computation. 

In  tomographic  imaging  problems  solved  by  filtered  back-projection  [4],  and  in  spotlight  synthetic 
aperture  radar  [5],  data  are  collected  on  a  polar  raster  of  points,  rather  than  on  a  rectangular  lattice. 
Although such data can be interpolated onto a rectangular lattice, this is necessarily inexact; it also affects
the covariance function. For example, the covariance of an isotropic random field on a rectangular lattice
is a Toeplitz function of the ordinates and abscissae, while on a polar raster it is a Toeplitz-plus-Hankel
function  of  the  radii.  For  smoothing  noisy  images  and  performing  image  coding  for  images  defined  on  a 
polar  raster,  it  is  clearly  desirable  to  develop  analogues  of  the  multichannel  Levinson  and  Schur  algorithms 
applicable  to  discrete  random  fields  defined  on  a  polar  raster. 

This paper develops these analogues. They generalize previous results in three ways: (1) the random
field is defined on a polar raster; (2) the random field is not required to be stationary; rather, its covariance
must have Toeplitz-plus-Hankel structure in both the radial and transverse directions (some important
cases of such random fields are noted in Section IV); and (3) the quarter-plane or asymmetric half-plane
causality assumption is replaced by a more natural causality defined in the radial direction only. The
prediction filters estimate the random field at a given point using observations from all points of smaller
radius.

Three other features are worth noting here. First, the algorithms are generalized three-term
recurrences, similar in structure to the split algorithms [6, 7]. The one-dimensional split algorithm recursions
require only half as many multiplications as the two-component lattice recursions of the Levinson and
Schur algorithms. Our two-dimensional algorithms are similarly computationally efficient, which is
important in two-dimensional signal processing. Second, the smoothing filters for estimating the random
field from observations at points of smaller and greater radii can be easily computed [8] from the prediction
filters using a discrete multi-dimensional generalization of the application of the Bellman-Siegert-Krein
identity to the one-dimensional smoothing problem in [9]. Indeed, the new two-dimensional algorithms
of this paper are applied to arbitrary Toeplitz-plus-Hankel-block-Toeplitz-plus-Hankel systems in Section
V.

Finally, we note that similar ideas have been applied to continuous-parameter isotropic [10] and
homogeneous  [11]  random  fields,  and  to  random  fields  with  more  general  Toeplitz-plus-Hankel  structure  in 
[12]  and  [13]:  this  paper  can  be  viewed  as  a  discrete  version  of  the  results  of  [13].  Although  the  continuous 
algorithm  can  always  be  discretized,  an  inherently  discrete  algorithm  can  be  expected  to  perform  better 
on  a  computer;  there  are  minor  yet  significant  differences  between  the  results  of  this  paper  and  the 
continuous  results  of  [13]  (see  Section  IV).  Also,  in  some  problems  the  data  are  sampled,  or  only  taken  at 
discrete  points.  These  facts  motivate  us  to  develop  a  discrete  counterpart  of  the  continuous  algorithms. 
We  also  note  that  the  one-dimensional  version  of  this  algorithm  has  been  presented  in  [14],  and  that  a 
summary  of  the  results  of  this  paper  was  presented  in  [15]. 

This paper is organized as follows. In Section II, the two-dimensional analogue of the discrete split
Levinson recurrence for the linear prediction problem on a polar raster is derived. The derivation is based
on the assumption that both the radial part and the transverse part of the covariance have
Toeplitz-plus-Hankel structure. Section III derives a corresponding Schur algorithm, to be propagated in parallel with
the Levinson algorithm. Some examples of random fields with covariances having Toeplitz-plus-Hankel
structure are discussed in Section IV, and comparisons with the results of [13] are made. In Section V, the
computational complexity of the proposed algorithm is evaluated, and compared with other algorithms.
The solution to a general Toeplitz-plus-Hankel block Toeplitz-plus-Hankel system of equations is also
developed. Section VI concludes with a summary.

II  DERIVATION  OF  THE  LEVINSON-LIKE  RECURRENCE 

A.  Basic  Problem 

The problem considered is as follows. Given noisy observations $\{y_{i,N}\}$ of a zero-mean real-valued
discrete random field $\{x_{i,N}\}$ at the points (i, N) of a polar raster on a disk, compute the linear
least-squares predictions of $x_{i,N}$ for all points on the edge of the disk using all the data inside the disk. Here
i is an integer radius from the origin, and N is the integer index of the argument (angle); if there are M
points distributed on the circle of any radius, then (i, N) is the point at radius i and angle 2πN/M.
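
To make the indexing concrete, the sketch below maps a raster index pair (i, N) to Cartesian coordinates. The function name and the convention that the angle is measured from the positive x-axis are assumptions made for illustration only.

```python
import math

def polar_raster_point(i, n, m):
    """Cartesian coordinates of raster point (i, N): radius i, angle 2*pi*N/M.

    i : radius (may be negative or half-integer in the interpolated systems)
    n : angular index N
    m : number of points M on each circle
    """
    angle = 2.0 * math.pi * n / m
    return i * math.cos(angle), i * math.sin(angle)
```

With M even, a negative radius −i at angular index N lands on the same point as radius i at index N + M/2, which is why functions on the raster can be extended to negative radii by shifting the angle by M/2.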

The observations $\{y_{i,N}\}$ consist of the field $\{x_{i,N}\}$ plus additive noise, where the noise is a
zero-mean discrete white noise field with unit power, and $\{x_{i,N}\}$ and the noise are uncorrelated (white noise with
arbitrary power $\sigma^2$ can be easily handled by scaling). The covariance of $\{x_{i,N}\}$ is

$$ K(i,N_1;j,N_2) = E[x_{i,N_1}\, x_{j,N_2}] \qquad (1) $$

which  is  assumed  to  be  a  non-negative  definite  function  with  Toeplitz-plus-Hankel  structure  in  both 
arguments  (this  is  defined  precisely  in  (13)  and  (14)  below).  Although  an  actual  covariance  would  also 
be a symmetric function, symmetry in (1) is not required by the algorithms to follow; this permits their
application  to  general  Toeplitz-plus-Hankel  block  Toeplitz-plus-Hankel  systems  in  Section  V. 

The estimates of $x_{i,N_1}$ at the edge of the disk are computed from the observations $\{y_{i,N}\}$ using

$$ \hat{x}_{i,N_1} = \sum_{j=0}^{i-1} \sum_{N_2=1}^{M} h(i,N_1;j,N_2)\, y_{j,N_2} \qquad (2) $$

By the orthogonality principle of linear prediction, the optimal prediction filters $h(i,N_1;j,N_2)$ are
computed from the covariance $K(i,N_1;j,N_2)$ by solving the two-dimensional discrete Wiener-Hopf equation

$$ K(i,N_1;j,N_2) = h(i,N_1;j,N_2) + \sum_{n=0}^{i-1} \sum_{N_3=1}^{M} h(i,N_1;n,N_3)\, K(n,N_3;j,N_2) \qquad (3) $$

for all 0 ≤ j ≤ i − 1 and 1 ≤ $N_1, N_2$ ≤ M.

The goal of this paper is to derive fast algorithms for solving (3) for $h(i,N_1;j,N_2)$ when $K(i,N_1;j,N_2)$
has Toeplitz-plus-Hankel block Toeplitz-plus-Hankel structure.

For convenience in the derivation, we solve not (3) but the system of equations

$$ K(i,N_1;j,N_2) = h(i,N_1;j,N_2) + \sum_{n=-(i-1)}^{i-1} \sum_{N_3=1}^{M} h(i,N_1;n,N_3)\, K(n,N_3;j,N_2) \qquad (4) $$

for all −(i − 1) ≤ j ≤ i − 1 and 1 ≤ $N_1, N_2$ ≤ M. This modified system (4) is motivated by noting that
the continuous-parameter two-dimensional Wiener-Hopf integral equation


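$$
\begin{aligned}
K(x,y) &= h(x,y) + \int h(x,z)\, K(z,y)\, dz \\
&= h(x,y) + \int_0^{|x|} \int_0^{2\pi} h(x;|z|,\theta)\, K(|z|,\theta;y)\, |z|\, d\theta\, d|z|, \qquad x, y, z \in R^2,\ |y| < |x| \qquad (5)
\end{aligned}
$$

discretizes into

$$ K'(i,N_1;j,N_2) = h'(i,N_1;j,N_2) + \sum_{n=0}^{i-1} \sum_{N_3=1}^{M} h'(i,N_1;n,N_3)\, K'(n,N_3;j,N_2)\, n \qquad (6) $$

where the radial weighting factor n in (6) reflects |z| in (5). If we let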


=  ^h'{i,N,;j,N2)  =  ^h'{i,Ni-,-j,N2  +  Mf2) 
Kii,NvJ,N2)  =  ^K'ii,Nuj,N2)=^K'ii,Ni;-j,N2  +  M/2) 


(-) 

(8) 


then the sum in (4) is simply double the sum in (6), so that if $h(i,N_1;j,N_2)$ and $K(i,N_1;j,N_2)$ satisfy
(4), then $h'(i,N_1;j,N_2)$ and $K'(i,N_1;j,N_2)$ satisfy (6). Note that the second equalities in (7) and (8) will
hold on a polar raster, but are not required in (4). For convenience we continue to refer to $K(i,N_1;j,N_2)$
in (4) as the covariance function.

Similarly to the approach used in [14], we decompose the update procedure into two steps by
introducing an interpolated (auxiliary) system. As shown in Fig. 1, between every pair of points in the same
radial direction, we insert an auxiliary point. The covariance function $K(i,N_1;j,N_2)$ is interpolated at
these auxiliary points such that the Toeplitz-plus-Hankel structure (see (13), (14)) is maintained. Then
the interpolated system is defined as


1  1  ^  1 
A  (i.  .Vi;  j .Vj)  =  fi(f,  + -,  A2)  +  ^  /^(t, -Vj;  n,  jY3)A  (n,  iV3;_7 -f A''2)  (9) 

^  n=-(i- 1)N3=1  ^ 

for  interpolation  at  half-integer  values  of  j  and 


•-2  M 


11  11  2  1  1 

X'(^  +  -.^\■Ji+::),N,)  =  hii  +  -,N^;ji+-),N2)+  ^  M»>^,Ari;n,A3)iir(n,A^3;y(+-),^2) 


n=— (»— ■t)  ^3—1 


for interpolation at half-integer values of i. Note that in (10) j can also take on half-integer values.


B. Derivation of the Basic Levinson-Like Recurrence


Define the discrete wave operators $\Delta_r$ and $\Delta_\theta$ by

$$ \Delta_r f(i,N_1;j,N_2) = f(i+\tfrac12,N_1;j,N_2) + f(i-\tfrac12,N_1;j,N_2) - f(i,N_1;j+\tfrac12,N_2) - f(i,N_1;j-\tfrac12,N_2) \qquad (11) $$

$$ \Delta_\theta f(i,N_1;j,N_2) = f(i-\tfrac12,N_1+1;j,N_2) + f(i-\tfrac12,N_1-1;j,N_2) - f(i-\tfrac12,N_1;j,N_2+1) - f(i-\tfrac12,N_1;j,N_2-1) \qquad (12) $$

where $\Delta_r$ and $\Delta_\theta$ can be regarded as discrete versions of the corresponding continuous second-order
differential (wave) operators for the radial part and transverse part, respectively. In (12) $N_1 \pm 1$ and $N_2 \pm 1$ are computed
mod M, reflecting their definition as angular variables on a polar raster.

We assume that the covariance function has Toeplitz-plus-Hankel block Toeplitz-plus-Hankel
structure, defined by

$$ \Delta_r K(i,N_1;j,N_2) = 0 \qquad (13) $$

$$ \Delta_\theta K(i,N_1;j,N_2) = 0 \qquad (14) $$

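
The structural condition can be checked numerically. The sketch below treats only the radial operator and drops the angular indices, so it illustrates the one-dimensional content of the radial condition rather than the full two-dimensional one; the helper names are assumptions. It verifies that any function of the form a(i − j) + b(i + j) is annihilated by the radial wave operator, while a generic function is not.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def radial_wave_operator(f, i, j):
    """Discrete radial wave operator applied to f at (i, j), angular indices suppressed."""
    return f(i + HALF, j) + f(i - HALF, j) - f(i, j + HALF) - f(i, j - HALF)

def toeplitz_plus_hankel(toeplitz_part, hankel_part):
    """Build f(i, j) = toeplitz_part(i - j) + hankel_part(i + j)."""
    return lambda i, j: toeplitz_part(i - j) + hankel_part(i + j)
```

For example, with `f = toeplitz_plus_hankel(lambda d: d**3 - d, lambda s: 2*s*s + 1)`, `radial_wave_operator(f, i, j)` is zero for every i and j, because the Toeplitz terms cancel in pairs and so do the Hankel terms.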

Applying the Laplacian operator $\Delta = \Delta_r + \Delta_\theta$ to the equation (4), we have after some algebra


‘-i  M 


AA'(i.  A,;j,  A2)  =  0  =  A/z(i,Ai;j,A2)+  Yi  E  Nj)K{n,  NyJ,  N,) 


n=— (i-i)  /'<3— 1 
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‘''^11  1 

+  E  +  O  -  -  O '  •'^3 )  '  -^1 ; »  -  1-  ^’3)]A^(i  -  A'a;  J,  A2) 


\f 

+  E  i  -(»■  -  ^)’  '^’3)  -  ^1;  -(»■  -  1)’  ^3)]A'(-(z-  -  ^),  Ns;j,  N2) 

N3  =  1 

t-1  M 

+  E  E  hii,N^-,n,N3)ArK{n,N3;j,N2) 

n=-(.-l)  N3  =  I 
’-2  M  . 

T  E  E  “  o’^»’"’^3)A5A"(n,7V3;j,jV2)  (15) 

n=-(.-i)/V3  =  l  ^ 

The algebra required to derive (15) is a generalization of the algebra in [14]; the major difference is that there are no "end effect" terms in the sums over $N_3$ when $\Delta_\theta$ is applied. This is true since $h(i,N_1;j,N_2)$ is periodic with period $M$ in $N_1$ and $N_2$, since these indices represent angles on the polar raster.

Using (13) and (14) to note that the last two terms in (15) are zero, we note that (15) has the same form as the following linear combination:

$$\begin{aligned}
&\sum_{N_3=1}^{M} [V_i^+(N_1,N_3)K(i-\tfrac12,N_3;j,N_2) + V_i^-(N_1,N_3)K(-(i-\tfrac12),N_3;j,N_2)] \\
&\quad= \sum_{N_3=1}^{M} [V_i^+(N_1,N_3)h(i-\tfrac12,N_3;j,N_2) + V_i^-(N_1,N_3)h(-(i-\tfrac12),N_3;j,N_2)] \\
&\qquad+ \sum_{n=-(i-\frac12)}^{i-\frac12}\sum_{N_3=1}^{M}\sum_{N_4=1}^{M} [V_i^+(N_1,N_3)h(i-\tfrac12,N_3;n,N_4) + V_i^-(N_1,N_3)h(-(i-\tfrac12),N_3;n,N_4)]\,K(n,N_4;j,N_2)
\end{aligned} \tag{16}$$


where we have defined the potentials

$$V_i^+(N_1,N_2) = -[h(i+\tfrac12,N_1;i-\tfrac12,N_2) - h(i,N_1;i-1,N_2)] \tag{17}$$

$$V_i^-(N_1,N_2) = -[h(i+\tfrac12,N_1;-(i-\tfrac12),N_2) - h(i,N_1;-(i-1),N_2)] \tag{18}$$

Note that on a polar raster we have $V_i^+(N_1,N_2) = V_i^-(N_1,N_2+M/2)$. Since the covariance function $K(i,N_1;j,N_2)$ is assumed to be non-negative definite, equation (4) must have a unique solution. The solutions to (15) and (16) must be identical, so that

$$\Delta h(i,N_1;j,N_2) = \sum_{N_3=1}^{M} [V_i^+(N_1,N_3)h(i-\tfrac12,N_3;j,N_2) + V_i^-(N_1,N_3)h(-(i-\tfrac12),N_3;j,N_2)] \tag{19}$$

Equation (19) is the basic recurrence that is the heart of the Levinson-like algorithm. The left side is the difference of two two-dimensional discrete Laplacian operators, analogous to the difference of one-dimensional discrete Laplacian operators appearing in the split algorithms of [6]. The right side generalizes the three-term recurrence in [6] to a multi-term recurrence; this is analogous to the matrix recurrence in [7]. However, it is applicable to non-symmetric block Toeplitz-plus-Hankel systems, unlike that of [7]. Writing out (19) explicitly, we have
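$$\begin{aligned}
h(i+\tfrac12,N_1;j,N_2) ={}& h(i,N_1;j+\tfrac12,N_2) + h(i,N_1;j-\tfrac12,N_2) - h(i-\tfrac12,N_1;j,N_2) \\
&+ h(i-\tfrac12,N_1;j,N_2+1) + h(i-\tfrac12,N_1;j,N_2-1) - h(i-\tfrac12,N_1+1;j,N_2) - h(i-\tfrac12,N_1-1;j,N_2) \\
&+ \sum_{N_3=1}^{M} [V_i^+(N_1,N_3)h(i-\tfrac12,N_3;j,N_2) + V_i^-(N_1,N_3)h(-(i-\tfrac12),N_3;j,N_2)]
\end{aligned} \tag{20}$$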

for all $-(i-\tfrac32) \le j \le (i-\tfrac32)$ and $1 \le N_1,N_2 \le M$. Although we have implicitly treated i as positive throughout the derivations, the recursive equations hold for negative i as well. When i is an integer and j is a half-integer, equation (20) will update h from the real points to the interpolated points. When i is a half-integer and j is an integer, equation (20) will update h from the interpolated points to the real points.

III DERIVATION OF THE SCHUR-LIKE ALGORITHM

A. Derivation of the Schur-Like Recurrence


We still need to calculate the potentials $V_i^+(N_1,N_2)$ and $V_i^-(N_1,N_2)$ at the beginning of every update so that we can use the recursive formula (20). To do this, we introduce the Schur variables (defined at integer and half-integer points)

$$s(i,N_1;j,N_2) = \delta_{i,N_1;j,N_2} + K(i,N_1;j,N_2) - h(i,N_1;j,N_2) - \sum_{n=-(i-1)}^{i-1}\sum_{N_3=1}^{M} h(i,N_1;n,N_3)\,K(n,N_3;j,N_2) \tag{21}$$

where $\delta_{i,N_1;j,N_2} = 0$ unless $i = j$ and $N_1 = N_2$, in which case it is unity.

Since the Schur variables are linear combinations of the prediction error filters $\delta_{i,N_1;j,N_2} - h(i,N_1;j,N_2)$, equations (17)-(20) show that $s(i,N_1;j,N_2)$ satisfies the recurrence (20), but now for all j:

$$\begin{aligned}
s(i+\tfrac12,N_1;j,N_2) ={}& s(i,N_1;j+\tfrac12,N_2) + s(i,N_1;j-\tfrac12,N_2) - s(i-\tfrac12,N_1;j,N_2) \\
&+ s(i-\tfrac12,N_1;j,N_2+1) + s(i-\tfrac12,N_1;j,N_2-1) - s(i-\tfrac12,N_1+1;j,N_2) - s(i-\tfrac12,N_1-1;j,N_2) \\
&+ \sum_{N_3=1}^{M} [V_i^+(N_1,N_3)s(i-\tfrac12,N_3;j,N_2) + V_i^-(N_1,N_3)s(-(i-\tfrac12),N_3;j,N_2)]
\end{aligned} \tag{22}$$

Equation (22) is the basic recurrence for the Schur-like algorithm. Note that for $-(i-1) \le j \le (i-1)$, $s(i,N_1;j,N_2) = 0$ by (4).

B. Computation of Potentials


Setting $j = (i-\tfrac12)$ and $-(i-\tfrac12)$ in (22), we have

$$\sum_{N_3=1}^{M} [V_i^+(N_1,N_3)s(i-\tfrac12,N_3;i-\tfrac12,N_2) + V_i^-(N_1,N_3)s(-(i-\tfrac12),N_3;i-\tfrac12,N_2)] = s(i-\tfrac12,N_1;i-\tfrac12,N_2) - s(i,N_1;i,N_2) + \Delta_\theta s(i,N_1;i-\tfrac12,N_2) \tag{23}$$

$$\sum_{N_3=1}^{M} [V_i^+(N_1,N_3)s(i-\tfrac12,N_3;-(i-\tfrac12),N_2) + V_i^-(N_1,N_3)s(-(i-\tfrac12),N_3;-(i-\tfrac12),N_2)] = s(i-\tfrac12,N_1;-(i-\tfrac12),N_2) - s(i,N_1;-i,N_2) + \Delta_\theta s(i,N_1;-(i-\tfrac12),N_2) \tag{24}$$

Equations (23) and (24) can be written in matrix notation as

$$V^+S^{++} + V^-S^{-+} = X^+ \tag{25}$$

$$V^+S^{+-} + V^-S^{--} = X^- \tag{26}$$

where we have defined the $M \times M$ matrices

$$[V^+]_{N_1,N_2} = V_i^+(N_1,N_2), \qquad [V^-]_{N_1,N_2} = V_i^-(N_1,N_2) \tag{27}$$

$$[S^{\pm\pm}]_{N_1,N_2} = s(\pm(i-\tfrac12),N_1;\pm(i-\tfrac12),N_2) \tag{28}$$

$$[X^\pm]_{N_1,N_2} = s(i-\tfrac12,N_1;\pm(i-\tfrac12),N_2) - s(i,N_1;\pm i,N_2) + \Delta_\theta s(i,N_1;\pm(i-\tfrac12),N_2) \tag{29}$$

In (28) the first superscript sign goes with the first radial argument and the second sign with the second radial argument.

If the system matrix defined in (4) (written explicitly in (44) below) is strongly non-singular, i.e. the leading principal submatrices are all non-singular, then (25) and (26) can be solved in closed form as

$$V^+ = (X^+ - X^-(S^{--})^{-1}S^{-+})(S^{++} - S^{+-}(S^{--})^{-1}S^{-+})^{-1} \tag{30}$$

$$V^- = (X^- - X^+(S^{++})^{-1}S^{+-})(S^{--} - S^{-+}(S^{++})^{-1}S^{+-})^{-1} \tag{31}$$

The strongly non-singular assumption is necessary and sufficient for (25) and (26) to have a unique solution; the proof of this is a direct generalization of the one in [14]. A similar assumption is required by the standard multichannel Levinson and Schur algorithms.
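
The closed forms (30) and (31) are the usual block-elimination (Schur complement) solution of the pair (25), (26). The Python sketch below checks this on a small example with exact rational arithmetic. Everything in it is an assumption made for illustration: the 2 x 2 test matrices, the helper names, and the choice of `fractions.Fraction` instead of floating point.

```python
from fractions import Fraction

def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(len(b)))
             for c in range(len(b[0]))] for r in range(len(a))]

def matadd(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def matsub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def inverse(a):
    """Gauss-Jordan inverse with exact rational arithmetic."""
    n = len(a)
    aug = [[Fraction(x) for x in row] + [Fraction(int(r == c)) for c in range(n)]
           for r, row in enumerate(a)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def solve_potentials(spp, spm, smp, smm, xp, xm):
    """Evaluate the closed forms (30) and (31)."""
    smm_inv, spp_inv = inverse(smm), inverse(spp)
    v_plus = matmul(matsub(xp, matmul(matmul(xm, smm_inv), smp)),
                    inverse(matsub(spp, matmul(matmul(spm, smm_inv), smp))))
    v_minus = matmul(matsub(xm, matmul(matmul(xp, spp_inv), spm)),
                     inverse(matsub(smm, matmul(matmul(smp, spp_inv), spm))))
    return v_plus, v_minus

# Arbitrary test data: pick V+ and V-, build X+ and X- from (25) and (26).
SPP, SPM = [[4, 1], [0, 3]], [[1, 0], [2, 1]]
SMP, SMM = [[0, 1], [1, 0]], [[5, 2], [1, 4]]
VP_TRUE, VM_TRUE = [[1, 2], [3, 4]], [[0, -1], [2, 1]]
XP = matadd(matmul(VP_TRUE, SPP), matmul(VM_TRUE, SMP))
XM = matadd(matmul(VP_TRUE, SPM), matmul(VM_TRUE, SMM))

VP, VM = solve_potentials(SPP, SPM, SMP, SMM, XP, XM)
print(VP == VP_TRUE, VM == VM_TRUE)
```

Note that the unknowns multiply from the left in (25) and (26), so the inverses in (30) and (31) appear on the right.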

The split Schur-like algorithm consists of computing $s(i,N_1;j,N_2)$ by propagating (22), initialized using $K(i,N_1;j,N_2)$, while computing the $V_i^\pm(N_1,N_2)$ from the $s(i,N_1;j,N_2)$ using (28)-(31).


C. Summary of Overall Procedure

The overall procedure can be summarized as follows. Let $I_{max}$ be the largest radius (maximum radial prediction order). Then:

1. Initialization of Split Schur-Like Algorithm


^±^.0  ~  H±i,o  —  Kii,o(I  +  Ko.o) 


-1 


where 


=  A'(±5,A'i;0,  iV2),  [Ko,oj;Vi,N2  =  K{0,  Ni;0,  N2), 


[K±,.o]N..At2  =  A'(±l,iVi;0,A2) 

1  1  ^1 

=  +  Yi  A3; j,iV2)  • 

N3  =  \ 


for  all  j  =  ±^, . . . ,  ±2Imax  and  Ai,  A2  =  1,. . . , M 


$$s(\pm1,N_1;j,N_2) = \delta_{\pm1,N_1;j,N_2} + K(\pm1,N_1;j,N_2) - h(\pm1,N_1;j,N_2) - \sum_{N_3=1}^{M} h(\pm1,N_1;0,N_3)\,K(0,N_3;j,N_2)$$

for all $j = \pm1, \ldots, \pm2I_{max}$ and $N_1, N_2 = 1, \ldots, M$

2. Propagation of Split Schur-Like Algorithm

A. Computation of the potentials $V_i^+(N_1,N_2)$ and $V_i^-(N_1,N_2)$:

Compute $V_i^+(N_1,N_2)$ and $V_i^-(N_1,N_2)$ from the available $s(\pm(i-\tfrac12),N_1;\pm(i-\tfrac12),N_2)$ and $s(\pm i,N_1;\pm i,N_2)$ using equations (30) and (31);

B. Update the Schur variables

For $j = \pm(i+\tfrac12)$ To $j = \pm2I_{max}$, $N_1 = 1$ To $M$, $N_2 = 1$ To $M$, Parallel Do

Update the Schur variables using (22).

End Parallel Do $\{j,N_1,N_2\}$.

3. Propagation of Split Levinson-Like Recurrence

A. Propagate the Boundary Points

For $N_1 = 1$ To $M$, $N_2 = 1$ To $M$, Parallel Do

$$h(i+\tfrac12,N_1;i-\tfrac12,N_2) = h(i,N_1;i-1,N_2) - V_i^+(N_1,N_2) \tag{32}$$

$$h(i+\tfrac12,N_1;-(i-\tfrac12),N_2) = h(i,N_1;-(i-1),N_2) - V_i^-(N_1,N_2) \tag{33}$$

End Parallel Do $\{N_1,N_2\}$.

B. Propagate Non-Boundary Points

For $j = -(i-\tfrac32)$ To $j = (i-\tfrac32)$, $N_1 = 1$ To $M$, $N_2 = 1$ To $M$, Parallel Do
Update $h(i,N_1;j,N_2)$ using equation (20).

End Parallel Do $\{j,N_1,N_2\}$.

4. Repeat steps 2 and 3 from $i = 1$ to $I_{max}$ with increment $\tfrac12$.

Note that the Levinson and Schur recurrences (20) and (22) have identical forms, with complementary supports. Hence they can be propagated in parallel using identical processors; this possibility was first noted for the one-dimensional case in [16].


IV  RANDOM  FIELDS  WITH  BLOCK 
TOEPLITZ-PLUS-HANKEL  COVARIANCES 

In the above derivation, we have assumed that the covariance function is already known. If only a sequence of two-dimensional time series data is available, there are two methods for obtaining a covariance function having the desired Toeplitz-plus-Hankel structure (13), (14). The first method is to compute a data covariance matrix, and then determine a symmetric Toeplitz-plus-Hankel block Toeplitz-plus-Hankel matrix close (in some sense) to this matrix. This is a two-dimensional Toeplitz-plus-Hankel generalization of the well-known "Toeplitzation" problem [17]. Some procedures for this problem are suggested in [18]. The second method is to assume that the data are generated by some underlying model, for which unknown parameters may need to be determined.

In this section we focus on the second approach, giving some specific examples of random fields whose covariances satisfy assumptions (13) and (14). These are merely illustrative; there are of course many others. We also note how the algorithms of Section II relate to the continuous-parameter algorithms of [13].


A. Isotropic Random Fields

For an isotropic random field, the covariance is a function of distance only, i.e., if x and y are two arbitrary points in the plane, then $K(x,y) = K(|x-y|)$. Consider the special case of an isotropic random field with covariance $K(x,y) = \rho^{|x-y|^2}$, which is often used in image modeling [19]. In polar coordinates on a discrete polar raster, this covariance function can be represented as

$$\begin{aligned}
K(i,N_1;j,N_2) &= \rho^{\,i^2+j^2-2ij\cos(2\pi(N_1-N_2)/M)} \\
&\approx 1 + \tfrac12\big([(i+j)^2+(i-j)^2] - [(i+j)^2-(i-j)^2]\cos(2\pi(N_1-N_2)/M)\big)\ln\rho \qquad \text{if } \rho \approx 1
\end{aligned} \tag{34}$$

Note that the exponent has the Toeplitz-plus-Hankel structure required by (13) and (14), and that it is not merely Toeplitz in i and j; hence the multichannel Levinson algorithm is not applicable. If $\rho \approx 1$, the entire covariance approximately satisfies (13) and (14). Indeed, any slowly-changing function of distance on a polar raster approximately satisfies (13) and (14) in its radial and angular arguments.
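
The two claims above can be checked numerically. The Python sketch below evaluates the operators (11) and (12) on the squared distance and on $\rho^{d^2}$ for two values of $\rho$. It is a rough illustration only: `M`, the grid size `imax`, the values of `rho`, and all function names are assumptions made here.

```python
import math

M = 8  # assumed number of angular samples

def dist2(i, n1, j, n2):
    """Squared distance between raster points (i, N1) and (j, N2)."""
    return i * i + j * j - 2.0 * i * j * math.cos(2.0 * math.pi * (n1 - n2) / M)

def delta_r(f, i, n1, j, n2):
    """Radial operator (11) with half-integer steps."""
    return (f(i + 0.5, n1, j, n2) + f(i - 0.5, n1, j, n2)
            - f(i, n1, j + 0.5, n2) - f(i, n1, j - 0.5, n2))

def delta_theta(f, i, n1, j, n2):
    """Angular operator (12); the cosine makes the wrap-around automatic."""
    return (f(i - 0.5, n1 + 1, j, n2) + f(i - 0.5, n1 - 1, j, n2)
            - f(i - 0.5, n1, j, n2 + 1) - f(i - 0.5, n1, j, n2 - 1))

def worst_residual(f, imax=3):
    """Largest operator residual over integer and half-integer radii."""
    radii = [k / 2.0 for k in range(-2 * imax, 2 * imax + 1)]
    return max(max(abs(delta_r(f, i, n1, j, n2)),
                   abs(delta_theta(f, i, n1, j, n2)))
               for i in radii for j in radii
               for n1 in range(M) for n2 in range(M))

def covariance(rho):
    """Isotropic covariance rho ** (squared distance)."""
    return lambda i, n1, j, n2: rho ** dist2(i, n1, j, n2)

print(worst_residual(dist2))               # zero up to rounding error
print(worst_residual(covariance(0.9999)))  # small when rho is close to 1
print(worst_residual(covariance(0.9)))     # noticeably larger
```

The residual of the covariance itself shrinks as $\rho \to 1$ because the first-order term in $\ln\rho$ is the exponent, which satisfies (13) and (14) exactly; only the higher-order terms remain.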

B.  Separable  Covariance  Functions 

A separable covariance function is one that can be decomposed into the product of a function of the radial part and a function of the transverse part, i.e., the covariance function $K(i,N_1;j,N_2)$ can be expressed as

$$K(i,N_1;j,N_2) = R(i,j) \times T(N_1,N_2) \tag{35}$$

for some functions R and T. This type of covariance function satisfies (13) and (14) as long as both R and T have Toeplitz-plus-Hankel structure. Examples include:

1. 2-D Discrete Wiener Process

The 2-D discrete Wiener process on a polar raster can be defined as

$$x_{i,N_1} = \sum_{j=0}^{i}\sum_{n=1}^{M} w_{j,n}, \qquad x_{0,N_1} = 0 \tag{36}$$

where $w_{j,n}$ is a zero-mean discrete white noise field with variance $\sigma^2$. Its covariance function is equal to

$$K(i,N_1;j,N_2) = E[x_{i,N_1}x_{j,N_2}] = M\sigma^2\min(i,j) = \tfrac12 M\sigma^2\,(i+j-|i-j|) \tag{37}$$

Note that $R(i,j)$ has Toeplitz-plus-Hankel structure and $T(N_1,N_2)$ is a constant function.
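
The Toeplitz-plus-Hankel structure of $\min(i,j)$ can be made explicit with a few lines of Python. This is only a sketch: radial indices are stored in units of 1/2 (an assumption made here so that integer arithmetic is exact), and the function names are hypothetical.

```python
def r_min(i2, j2):
    """min(i, j), with both indices measured in units of 1/2."""
    return min(i2, j2)

def r_split(i2, j2):
    """Hankel part (i + j) minus Toeplitz part |i - j|, halved."""
    return ((i2 + j2) - abs(i2 - j2)) // 2  # exact: both terms share parity

def delta_r(f, i2, j2):
    """Radial operator (11) applied to a function of the radii only."""
    return f(i2 + 1, j2) + f(i2 - 1, j2) - f(i2, j2 + 1) - f(i2, j2 - 1)

GRID = range(-10, 11)
same = all(r_min(a, b) == r_split(a, b) for a in GRID for b in GRID)
zero = all(delta_r(r_min, a, b) == 0 for a in GRID for b in GRID)
print(same, zero)
```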

2. 2-D Circularly Symmetric Markovian Random Field

In a first-order 2-D circularly symmetric Markovian random field, the output is a uniformly weighted linear combination of the previous "shell" of data plus white noise, i.e.


If $x_{0,n}$ is assumed to be zero for all n, and the variance of $w_{i,N_1}$ is equal to $\sigma^2$, then the covariance function is


=  1^1“'"'' -  (39)  ^ 

Again, $R(i,j)$ has Toeplitz-plus-Hankel structure and $T(N_1,N_2)$ is a constant function. In the limit $a \to 0$, (39) reduces to (37).

C. Relations with Continuous Algorithms


It is instructive to examine the continuous-parameter limits of some of the equations of this paper. Let the intervals between points be $\delta_r$ in the radial direction and $\delta_\theta$ radians in the transverse (angular) direction. Introducing a radial weighting factor, as discussed earlier, and taking limits as $\delta_r$ and $\delta_\theta$ go to zero result in the following transformations:

1. The discretized Wiener-Hopf equation (6) becomes the Wiener-Hopf integral equation (5);

2. $\delta_{i,N_1;j,N_2}$ becomes a continuous two-dimensional impulse function, dominating the other terms in the definition (21) of the Schur variables, so that (30) and (31) may be replaced with $V^+ \approx X^+$ and $V^- \approx X^-$. Using this, equation (29) becomes


$$V(x,\theta_1;\theta_2) = -\left(\frac{\partial}{\partial x} + \frac{\partial}{\partial y}\right) s(x,\theta_1;y=x,\theta_2) \tag{40}$$

where x and y are continuous radii and $\theta_1$ and $\theta_2$ are continuous angles. Equation (40) has the form of (4-17b) of [13]. Similarly, the continuous version of (13) has the form of (4-2) of [13];

3. Equation (15), with its difference of discrete two-dimensional Laplacian operators on the left side, is clearly analogous to ($\Delta_x$ = Laplacian with respect to x)

$$(\Delta_x - \Delta_y)\,h(x,\theta_1;y,\theta_2) = \int_0^{2\pi} V(x,\theta_1;\theta_3)\,h(x,\theta_3;y,\theta_2)\,d\theta_3 \tag{41}$$

which is the two-dimensional form of (4-1) of [13]. However, (41) is NOT the continuous limit of
(15)  with  radial  weighting,  since  —  (^  +  which  is  not  the  radial 

part  of  the  2-D  Laplacian.  On  the  other  hand,  =  (^  -f  f^)/(i),  which  is  the  radial 

part  of  the  3-D  Laplacian.  This  shows  that  the  results  of  [13],  derived  for  the  continuous  3-D  case, 
do  not  apply  exactly  to  the  2-D  case  (as  do  the  results  of  this  paper); 

4. The algorithms of this paper require the differences of the radial parts and transverse parts of the Laplacian of the covariance to be separately zero: (13) and (14) must hold separately. However, in the continuous limit, we have $h(i,N_1;n,N_3) \approx h(i-\tfrac12,N_1;n,N_3)$, and the last two sums in (15) may be combined. Then it suffices for the sum $(\Delta_r + \Delta_\theta)K(i,N_1;j,N_2) = 0$, rather than (13) and (14) separately. This agrees with the requirement $(\Delta_x - \Delta_y)K(x,y) = 0$ for the algorithms in [13].

D. Application to Discretized Continuous-Parameter Problems

We can draw some important conclusions from these observations. If the algorithms of this paper are being used to solve the discretized version (6) of the Wiener-Hopf equation (5), then:

1. Equations (30) and (31) may be replaced with the approximations $V^+ \approx X^+$ and $V^- \approx X^-$;

2. By the chain rule, any continuous function of the distance between two points will satisfy (13) and (14), since the square of the distance itself does. Hence the algorithms may be used for any isotropic random field. Note in particular that (32) becomes

K{i,Nuj,N2)  =  (42) 

and  — »  1  as  — *  0; 

3. Conditions (13) and (14) may be replaced with the more general condition $(\Delta_r + \Delta_\theta)K(i,N_1;j,N_2) = 0$.

Numerical studies have shown that approximation (4) gives very good results for $\delta_r \approx 0.001$, but approximation (6) is much more sensitive to non-infinitesimal $\delta_r$.


V COMPLEXITY AND GENERAL TOEPLITZ-PLUS-HANKEL SYSTEMS

A. Computational Complexity

We determine the number of Multiplications-And-Divisions (MADs) needed to solve (4) up to order $i = I_{max}$. Although some current DSP chips can perform multiplications as quickly as additions, the fact remains that multiplication is a more complex operation than addition. Also, the computational savings in the number of additions is similar to that for MADs, although we omit details.

The initialization of the Levinson-like recurrences requires 2 $M \times M$ matrix inversions and 4 $M \times M$ matrix multiplications, or $2(\frac{M^3}{3} + \frac{M^2}{2}) + 4M^3$ MADs. The initialization of the Schur-like recurrences requires $8I_{max}$ $M \times M$ matrix multiplications, or $8I_{max}M^3$ MADs. Each Schur-like recursion update of $s(i,N_1;j,N_2)$ from $i$ to $i+\tfrac12$ requires $16(I_{max}-i)M^3$ MADs. Computation of the potentials requires 4 $M \times M$ matrix inversions and 6 $M \times M$ matrix multiplications. Finally, updating $h(i,N_1;j,N_2)$ from $i$ to $i+\tfrac12$ in the Levinson-like recurrence requires $4(2i+1)M^3$ MADs. The total number of multiplications needed to solve (4) up to $i = I_{max}$ is

$$\begin{aligned}
&4M^3 + 2\Big(\frac{M^3}{3} + \frac{M^2}{2}\Big) + 8I_{max}M^3 + 2\sum_{i=1}^{I_{max}}\Big[16(I_{max}-i)M^3 + \Big(4\Big(\frac{M^3}{3} + \frac{M^2}{2}\Big) + 6M^3\Big) + 4(2i+1)M^3\Big] \\
&\quad= 24I_{max}^2M^3 + I_{max}\Big(\frac{68M^3}{3} + 4M^2\Big) + 4M^3 + 2\Big(\frac{M^3}{3} + \frac{M^2}{2}\Big)
\end{aligned} \tag{43}$$

This can be seen to be $O(I_{max}^2M^3)$ MADs if $I_{max} \gg M \gg 1$. Solution of (4) using Gaussian

elimination  would  require  =  0{I^ax^^)  MADs.  Hence  the  savings  in  MADs  over 

Gaussian elimination for large $I_{max}$ and $M$ is a factor of order $I_{max}M$.
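
The leading term of the count can be reproduced by summing the two per-step update costs quoted above, $16(I_{max}-i)M^3$ for the Schur-like update and $4(2i+1)M^3$ for the Levinson-like update. The Python sketch below does this; the function name is hypothetical, the factor of 2 is taken from the sum in (43), and the lower-order inversion and initialization costs are deliberately left out.

```python
def dominant_mads(i_max, m):
    """Sum the per-step Schur and Levinson update costs quoted in the text."""
    total = 0
    for i in range(1, i_max + 1):
        schur_update = 16 * (i_max - i) * m ** 3
        levinson_update = 4 * (2 * i + 1) * m ** 3
        total += 2 * (schur_update + levinson_update)  # factor 2 as in (43)
    return total

for i_max, m in [(1, 1), (5, 3), (40, 8)]:
    print(i_max, m, dominant_mads(i_max, m), 24 * i_max ** 2 * m ** 3)
```

The two update costs are complementary: the Schur-like work shrinks with i while the Levinson-like work grows, and the terms linear in $I_{max}$ cancel, which is why the sum is exactly $24I_{max}^2M^3$.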


B. Comparison with Reformulation as a Block-Toeplitz System


In [20] Merchant and Parks noted that a Toeplitz-plus-Hankel system of equations can be reformulated as a block-Toeplitz system of equations with 2 x 2 blocks. Although no multichannel generalizations were discussed in [20], it is not difficult to show that a system of equations in which the system matrix is the sum of a block-Toeplitz matrix and a block-Hankel matrix, where the blocks are $M \times M$, can be reformulated as a block-Toeplitz system of equations with $2M \times 2M$ blocks. This could then be solved using the multichannel Levinson algorithm. We now compare this approach, which we call the generalized Merchant-Parks procedure, to the algorithm of this paper.
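
One way to see that such a reformulation is possible, for the scalar case, is sketched below in Python. It is not necessarily the exact construction of [20]: it relies only on the fact that reversing the rows or columns of a Hankel matrix gives a Toeplitz matrix, so that $(T+H)x = b$ can be embedded in a doubled system whose four blocks are all Toeplitz. The matrices, the vector `x`, and the helper names are arbitrary assumptions.

```python
n = 4
t = {d: 10 + 3 * d + d * d for d in range(-(n - 1), n)}  # non-symmetric Toeplitz
h = {s: 2 * s - 5 for s in range(2 * n - 1)}              # Hankel generator

T = [[t[a - b] for b in range(n)] for a in range(n)]
H = [[h[a + b] for b in range(n)] for a in range(n)]

def matvec(a, x):
    return [sum(row[k] * x[k] for k in range(len(x))) for row in a]

def is_toeplitz(a):
    return all(a[r][c] == a[r - 1][c - 1]
               for r in range(1, len(a)) for c in range(1, len(a)))

HJ = [row[::-1] for row in H]         # H J: columns reversed
JH = H[::-1]                          # J H: rows reversed
JTJ = [row[::-1] for row in T[::-1]]  # J T J: both reversed

x = [1, -2, 3, 5]
y = x[::-1]                           # y = J x
b = [u + v for u, v in zip(matvec(T, x), matvec(H, x))]

# Doubled system [[T, HJ], [JH, JTJ]] [x; y] = [b; J b]
top = [u + v for u, v in zip(matvec(T, x), matvec(HJ, y))]
bottom = [u + v for u, v in zip(matvec(JH, x), matvec(JTJ, y))]
print(top == b, bottom == b[::-1])
print(all(is_toeplitz(blk) for blk in (T, HJ, JH, JTJ)), is_toeplitz(H))
```

Interleaving the entries of x and y then turns the array of four Toeplitz blocks into a block-Toeplitz matrix with 2 x 2 blocks. With $M \times M$ blocks in place of scalars the same argument would give $2M \times 2M$ blocks.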

If  the  generalized  Merchant- Parks  procedure  is  used  to  solve  (4)  up  to  order  i  =  Imai-  the  number 
of  MADs  required  is  -I-  +  2A/^),  since  2M  x  2M  matrices  are  being  multiplied  and 

propagated.  Hence  if  /max  >>  M  >>  1  the  algorithm  of  this  paper  requires  roughly  ^  as  many  .M.ADs 
as the generalized Merchant-Parks procedure; for large M this can be quite significant. If $M = 1$ the algorithm of this paper reduces to that of [14], which requires roughly 75% as many MADs as the original Merchant-Parks procedure [20].

On the other hand, the algorithm of this paper requires that the system matrix be block Toeplitz-plus-Hankel with Toeplitz-plus-Hankel blocks, while the generalized Merchant-Parks algorithm does not require the blocks to have special structure. Thus the generalized Merchant-Parks algorithm requires more computation, but solves a more general problem.


C.  Solution  of  Arbitrary  Toeplitz-plus-Hankel  Block  Toeplitz-plus-Hankel  Systems 

Equation  (4)  can  be  written  as  the  following  Toeplitz-plus-Hankel  block  Toeplitz-plus-Hankel  system: 

r  [  I  +  ■ .  •  K_,, 


0  —I 

-I  H_,  0 


K.,_,  •  ■  I  +  K.,. 


S.._,  0  •••  0 

0  0  S_i,, 

where 

=  h(±i,Ni\±j,N2),  j  =  -{i-  l),...,(t-  1),1  <  Ni,N2  <  M 
=  H{j,Nx\fN2),  jj  =  -(»  -  1), . . . ,(i  -  1),  1  <  Ni,N2  <  M 


(44) 


(45) 

(46)

$$[S_{\pm i,\pm i}]_{N_1,N_2} = s(\pm i,N_1;\pm i,N_2), \qquad 1 \le N_1,N_2 \le M \tag{47}$$

In (44)-(47) I is the $M \times M$ identity matrix and 0 is an $M \times M$ matrix of zeros. Conditions (13) and (14) are equivalent to requiring that the system matrix in (44) be block Toeplitz-plus-Hankel with Toeplitz-plus-Hankel blocks.

In this section we solve a Toeplitz-plus-Hankel block Toeplitz-plus-Hankel system of equations having the same system matrix as (44), but with an arbitrary right side. This system is


I  + 


X_.  •••  X._,  X. 


I  +  K.,. 


=  B_. 


B_(,_i)  •••  B, 


where the right side is arbitrary. Recall that the algorithms of this paper do not require the system matrix to be symmetric. To find the solution $X\,(= [X_{-i}, \ldots, X_i])$, note that from the definition (21) of $s(i,N_1;j,N_2)$ we have


I  + 


I  +  K.,. 


where 


S-(i-m-H),-i  ■■■  ^  — +  m-Hl)  ^  m-l-l  ® 


Hm  —  1  j,  •  •  •  1  Hi—tn+l  it  —  m »  t  ^ 

. . . , . . .  ,0] 


Equation (44) is a special case of (49) with m = 1.

Assume that all of the central submatrices of the system (44) are non-singular. Then the unique

solution  to  (48)  can  be  expressed  as  a  linear  combination  of  =  —i, . . .  ,i  by  using  (49) 

1  i 

X  =  Y1  Xj  =  E  C/H,,,).  (.52) 

m=— i,m^O  /=  — • 

Here $C_m$ can be found by equating the linear combination (52) to (48), for $1 \le j \le (i-1)$:

$$C_{-j}S_{-j,-j} + C_jS_{j,-j} = -\Big(B_{-j} + \sum_{n=-(j-1)}^{j-1} C_nS_{n,-j}\Big) \tag{53}$$

$$C_{-j}S_{-j,j} + C_jS_{j,j} = -\Big(B_j + \sum_{n=-(j-1)}^{j-1} C_nS_{n,j}\Big) \tag{54}$$

The overall procedure is as follows. Compute the $H_{i,j}$ and $S_{i,j}$ using the Levinson-like and Schur-like algorithms. Next, recursively compute $C_{\pm j}$ in increasing j by solving the $2M \times 2M$ systems (53) and (54). Finally, compute X using (52).

The  procedures  in  (52-54)  require  roughly  MADs,  which  for  /moi  >>  M  >>  1  dominates 

the $24I_{max}^2M^3$ MADs that is the dominant term in the number of MADs (43) required by the basic algorithm. For an arbitrary right side, the generalized Merchant-Parks algorithm requires
MADs.  Thus  the  algorithm  of  this  section  requires  only  as  many  MADs  when  Imax  >>  M  »  1. 

VI  CONCLUSION 

New fast algorithms for solving the discrete two-dimensional Wiener-Hopf equation on a polar raster when the covariance function has Toeplitz-plus-Hankel structure have been derived. Since we have performed explicitly discrete derivations, instead of just discretizing the continuous versions [13], the algorithms do not require fine discretization or closely-spaced points; if adjacent points are close enough, then the algorithms reduce to the continuous case [13]. In particular, the proposed fast algorithms make full use
of  the  Toeplitz-plus-Hankel  structure  of  the  covariance  function,  so  that  the  overall  computational  com¬ 
plexity  is  only  MADs,  as  opposed  to  MADs  for  the  generalized  Merchant-Parks 

algorithm  discussed  in  the  paper  and  0{I^ax^^)  MADs  for  Gaussian  elimination.  These  algorithms  are 
also highly parallelizable, making them even more favorable in a vector/parallel processor environment.

The smoothing filter for estimating the points inside the disk of observations can be computed from the prediction filters using a generalized discrete Bellman-Siegert-Krein identity, as was done for the one-dimensional continuous case in [?]. The overall complexity is reduced compared with Gaussian elimination. This is considered in the separate paper [8].

Unresolved issues include mapping of this algorithm into optimal array processor architectures, the numerical stability of the algorithm, and practical applications of this algorithm in problems such as image restoration and coding. Preliminary results on these issues have been encouraging.

[Figure legend: ○ denotes the interpolated point]

APPENDIX D


A. Basic Problem


The problem considered is as follows. From noisy observations $\{y_{i,N,M}\}$ of a zero-mean real-valued discrete random field $\{x_{i,N,M}\}$ at the points $(i,N,M)$ inside a sphere, compute the linear least-squares estimate of $\{x_{i,N,M}\}$ for all points on the edge of the sphere. Here i is an integer radius from the origin, and N and M are the integer indices of the arguments (angles).

The observations are related to the field $x_{i,N,M}$ by $y_{i,N,M} = x_{i,N,M} + v_{i,N,M}$, where $\{v_{i,N,M}\}$ is a zero-mean discrete white noise field with unit power, and $\{x_{i,N,M}\}$ and $\{v_{i,N,M}\}$ are uncorrelated. The covariance of $\{x_{i,N,M}\}$, $E[x_{i,N_1,M_1}x_{j,N_2,M_2}] = K(i,N_1,M_1;j,N_2,M_2)$, is assumed to be a non-negative definite function with the Toeplitz-plus-Hankel structure shown in (6) and (7). The estimates of $\{x_{i,N,M}\}$ at the edge of the sphere are computed from the observations $\{y_{i,N,M}\}$ using

$$\hat{x}_{i,N_1,M_1} = \sum_{j=0}^{i-1}\sum_{N_2=1}^{N}\sum_{M_2=1}^{M} h(i,N_1,M_1;j,N_2,M_2)\,y_{j,N_2,M_2} \tag{1}$$

The optimal prediction filters $h(i,N_1,M_1;j,N_2,M_2)$ are computed by solving the three-dimensional discrete Wiener-Hopf equation

$$K(i,N_1,M_1;j,N_2,M_2) = h(i,N_1,M_1;j,N_2,M_2) + \sum_{n=-(i-1)}^{i-1}\sum_{N_3=1}^{N}\sum_{M_3=1}^{M} h(i,N_1,M_1;n,N_3,M_3)\,K(n,N_3,M_3;j,N_2,M_2) \tag{2}$$

for all $-(i-1) \le j \le i-1$, $1 \le N_1,N_2 \le N$ and $1 \le M_1,M_2 \le M$. The goal is to derive a fast algorithm for solving (2) when $K(i,N_1,M_1;j,N_2,M_2)$ has the Toeplitz-plus-Hankel structure shown in (6) and (7) below.

B. Derivation of the Levinson-Like Recurrence

Define the discrete wave operators $\Delta_r$, $\Delta_\theta$ and $\Delta_\phi$ by

$$\Delta_r f(i,N_1,M_1;j,N_2,M_2) = f(i+\tfrac12,N_1,M_1;j,N_2,M_2) + f(i-\tfrac12,N_1,M_1;j,N_2,M_2) - f(i,N_1,M_1;j+\tfrac12,N_2,M_2) - f(i,N_1,M_1;j-\tfrac12,N_2,M_2) \tag{3}$$

$$\Delta_\theta f(i,N_1,M_1;j,N_2,M_2) = f(i-\tfrac12,((N_1+1))_1,M_1;j,N_2,M_2) + f(i-\tfrac12,((N_1-1))_1,M_1;j,N_2,M_2) - f(i-\tfrac12,N_1,M_1;j,((N_2+1))_1,M_2) - f(i-\tfrac12,N_1,M_1;j,((N_2-1))_1,M_2) \tag{4}$$

$$\Delta_\phi f(i,N_1,M_1;j,N_2,M_2) = f(i-\tfrac12,N_1,((M_1+1))_2;j,N_2,M_2) + f(i-\tfrac12,N_1,((M_1-1))_2;j,N_2,M_2) - f(i-\tfrac12,N_1,M_1;j,N_2,((M_2+1))_2) - f(i-\tfrac12,N_1,M_1;j,N_2,((M_2-1))_2) \tag{5}$$

where $\Delta_r$, $\Delta_\theta$ and $\Delta_\phi$ can be regarded as discrete versions of the continuous operators $(\frac{\partial^2}{\partial x^2} - \frac{\partial^2}{\partial y^2})$, $(\frac{\partial^2}{\partial\theta_1^2} - \frac{\partial^2}{\partial\theta_2^2})$ and $(\frac{\partial^2}{\partial\phi_1^2} - \frac{\partial^2}{\partial\phi_2^2})$ for the radial part and transverse parts, respectively, and $((\cdot))_{1(2)}$ means the mod $N$ ($M$) operation. To save space, we will omit the $((\cdot))$ in the following derivations. We assume that the covariance function has the Toeplitz-plus-Hankel structure that satisfies the following forms

$$\Delta_r K(i,N_1,M_1;j,N_2,M_2) = 0 \tag{6}$$

$$(\Delta_\theta + \Delta_\phi)K(i,N_1,M_1;j,N_2,M_2) = 0 \tag{7}$$

Applying the Laplacian operator $\Delta = \Delta_r + \Delta_\theta + \Delta_\phi$ to equation (2), we have after some algebra

$$\begin{aligned}
h(i+\tfrac12,N_1,M_1;j,N_2,M_2) ={}& h(i,N_1,M_1;j+\tfrac12,N_2,M_2) + h(i,N_1,M_1;j-\tfrac12,N_2,M_2) - h(i-\tfrac12,N_1,M_1;j,N_2,M_2) \\
&+ h(i-\tfrac12,N_1,M_1;j,N_2+1,M_2) + h(i-\tfrac12,N_1,M_1;j,N_2-1,M_2) \\
&- h(i-\tfrac12,N_1+1,M_1;j,N_2,M_2) - h(i-\tfrac12,N_1-1,M_1;j,N_2,M_2) \\
&+ h(i-\tfrac12,N_1,M_1;j,N_2,M_2+1) + h(i-\tfrac12,N_1,M_1;j,N_2,M_2-1) \\
&- h(i-\tfrac12,N_1,M_1+1;j,N_2,M_2) - h(i-\tfrac12,N_1,M_1-1;j,N_2,M_2) \\
&+ \sum_{N_3=1}^{N}\sum_{M_3=1}^{M} [V_i^+(N_1,M_1;N_3,M_3)h(i-\tfrac12,N_3,M_3;j,N_2,M_2) + V_i^-(N_1,M_1;N_3,M_3)h(-i+\tfrac12,N_3,M_3;j,N_2,M_2)]
\end{aligned} \tag{8}$$

for all $-(i-\tfrac32) \le j \le (i-\tfrac32)$, $1 \le N_1,N_2 \le N$ and $1 \le M_1,M_2 \le M$. Here we have defined the potentials

$$V_i^+(N_1,M_1;N_2,M_2) = -[h(i+\tfrac12,N_1,M_1;i-\tfrac12,N_2,M_2) - h(i,N_1,M_1;i-1,N_2,M_2)] \tag{9}$$

$$V_i^-(N_1,M_1;N_2,M_2) = -[h(i+\tfrac12,N_1,M_1;-i+\tfrac12,N_2,M_2) - h(i,N_1,M_1;-i+1,N_2,M_2)] \tag{10}$$

C. Derivation of the Schur-Like Recurrence

We still need to calculate the potentials $V_i^+(N_1,M_1;N_2,M_2)$ and $V_i^-(N_1,M_1;N_2,M_2)$ at the beginning of every update so that we can use the recursive formula (8). Since an inner product is a bottleneck in a parallel processing environment, we overcome this difficulty by introducing the Schur variables

$$s(i,N_1,M_1;j,N_2,M_2) = \delta_{i,N_1,M_1;j,N_2,M_2} + K(i,N_1,M_1;j,N_2,M_2) - h(i,N_1,M_1;j,N_2,M_2) - \sum_{n=-(i-1)}^{i-1}\sum_{N_3=1}^{N}\sum_{M_3=1}^{M} h(i,N_1,M_1;n,N_3,M_3)\,K(n,N_3,M_3;j,N_2,M_2) \tag{11}$$

where $\delta_{i,N_1,M_1;j,N_2,M_2} = 0$ unless $i = j$, $N_1 = N_2$ and $M_1 = M_2$, in which case it is unity.

Since the Schur variables are linear combinations of the prediction error filters $\delta_{i,N_1,M_1;j,N_2,M_2} - h(i,N_1,M_1;j,N_2,M_2)$, equations (8)-(11) show that $s(i,N_1,M_1;j,N_2,M_2)$ satisfies the recurrence (8), but now for all j:

$$\begin{aligned}
s(i+\tfrac12,N_1,M_1;j,N_2,M_2) ={}& s(i,N_1,M_1;j+\tfrac12,N_2,M_2) + s(i,N_1,M_1;j-\tfrac12,N_2,M_2) - s(i-\tfrac12,N_1,M_1;j,N_2,M_2) \\
&+ s(i-\tfrac12,N_1,M_1;j,N_2+1,M_2) + s(i-\tfrac12,N_1,M_1;j,N_2-1,M_2) \\
&- s(i-\tfrac12,N_1+1,M_1;j,N_2,M_2) - s(i-\tfrac12,N_1-1,M_1;j,N_2,M_2) \\
&+ s(i-\tfrac12,N_1,M_1;j,N_2,M_2+1) + s(i-\tfrac12,N_1,M_1;j,N_2,M_2-1) \\
&- s(i-\tfrac12,N_1,M_1+1;j,N_2,M_2) - s(i-\tfrac12,N_1,M_1-1;j,N_2,M_2) \\
&+ \sum_{N_3=1}^{N}\sum_{M_3=1}^{M} [V_i^+(N_1,M_1;N_3,M_3)s(i-\tfrac12,N_3,M_3;j,N_2,M_2) + V_i^-(N_1,M_1;N_3,M_3)s(-i+\tfrac12,N_3,M_3;j,N_2,M_2)]
\end{aligned} \tag{12}$$

Equation (12) is the basic recurrence for the Schur-like algorithm; for $-(i-1) \le j \le (i-1)$, $s(i,N_1,M_1;j,N_2,M_2) = 0$ by (2).

Setting $j = (i-\tfrac12)$ and $-(i-\tfrac12)$ in (12) respectively, we can solve for $V_i^+$ and $V_i^-$ using the following matrix equation


(13)


where we have defined the $NM \times NM$ matrices

$$[S_i^{\pm\pm}]_{L_1,L_2} = s(\pm(i-\tfrac12),N_1,M_1;\pm(i-\tfrac12),N_2,M_2) \tag{14}$$

$$[V_i^\pm]_{L_1,L_2} = V_i^\pm(N_1,M_1;N_2,M_2) \tag{15}$$

$$[X_i^\pm]_{L_1,L_2} = s(i-\tfrac12,N_1,M_1;\pm(i-\tfrac12),N_2,M_2) - s(i,N_1,M_1;\pm i,N_2,M_2) + (\Delta_\theta + \Delta_\phi)\,s(i,N_1,M_1;\pm(i-\tfrac12),N_2,M_2) \tag{16}$$

and $L_1, L_2$ are related to $N_1, M_1, N_2, M_2$ by

$$L_1 = (N_1-1)M + M_1 \tag{17}$$

$$L_2 = (N_2-1)M + M_2 \tag{18}$$

for all $1 \le N_1,N_2 \le N$, $1 \le M_1,M_2 \le M$, and $1 \le L_1,L_2 \le NM$.
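
The index maps (17) and (18) simply stack the two angular indices into a single row or column index. A minimal Python sketch follows, keeping the 1-based indexing of the text; the function names and the sizes `N` and `M` are hypothetical.

```python
def flatten(n_idx, m_idx, m_count):
    """L = (N - 1) M + M', with all indices 1-based as in (17) and (18)."""
    return (n_idx - 1) * m_count + m_idx

def unflatten(l_idx, m_count):
    """Inverse map: recover the pair (N, M') from a 1-based flat index L."""
    return (l_idx - 1) // m_count + 1, (l_idx - 1) % m_count + 1

N, M = 3, 4  # assumed raster sizes, for illustration only
flat = [flatten(n, m, M) for n in range(1, N + 1) for m in range(1, M + 1)]
print(flat)
print(unflatten(7, M))
```

Each pair $(N_1, M_1)$ maps to a distinct index between 1 and $NM$, so the $NM \times NM$ matrices in (14)-(16) hold one entry per pair of angular positions.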
D. Summary of Overall Procedure

The overall procedure can be summarized as follows. Let $I_{max}$ be the largest radius (maximum radial prediction order). Then for all $1 \le N_1,N_2 \le N$ and $1 \le M_1,M_2 \le M$:

1. Initialization

Compute $h(\pm\tfrac12,N_1,M_1;0,N_2,M_2)$, $h(\pm1,N_1,M_1;0,N_2,M_2)$ using (2).

Compute $s(\pm\tfrac12,N_1,M_1;j,N_2,M_2)$, $s(\pm1,N_1,M_1;j,N_2,M_2)$ using (11) for all $j = \pm1, \ldots, \pm2I_{max}$.

2. Propagation of Split Schur-Like Algorithm

A. Compute the potentials $V_i^+(N_1,M_1;N_2,M_2)$ and $V_i^-(N_1,M_1;N_2,M_2)$ by solving the matrix equation (13);

B. Update the Schur variables using (12) for $j = \pm(i+\tfrac12), \ldots, \pm2I_{max}$.

3. Propagation of Split Levinson-Like Recurrence

A. Propagate the Boundary Points:

$$h(i+\tfrac12,N_1,M_1;i-\tfrac12,N_2,M_2) = h(i,N_1,M_1;i-1,N_2,M_2) - V_i^+(N_1,M_1;N_2,M_2) \tag{19}$$

$$h(i+\tfrac12,N_1,M_1;-i+\tfrac12,N_2,M_2) = h(i,N_1,M_1;-i+1,N_2,M_2) - V_i^-(N_1,M_1;N_2,M_2) \tag{20}$$

B. Propagate Non-Boundary Points:

Update $h(i,N_1,M_1;j,N_2,M_2)$ using equation (8) for $j = -(i-\tfrac32)$ to $j = (i-\tfrac32)$.

4. Repeat steps 2 and 3 from $i = 1$ to $I_{max}$ with increment $\tfrac12$.

The overall procedure is similar to that for the 2-D algorithm, except that now there are two angular variables, instead of one, to be propagated. The number
of  MADs  required  is  roughly  which  is  far  less  than 

MADs  required  for  the  generalized  Merchant-Parks  algorithm  and 
MADs  for  Gaussian  elimination.  Hence,  the  computational  savings  in  the  3-D  case 
are  even  more  significant  than  those  in  the  2-D  case.  Furthermore,  these  algorithms 
are also highly parallelizable, making them even more attractive in a
parallel-processor environment.
Abstract

New fast algorithms for linear least-squares smoothing problems in one and two dimensions
are derived. These are discrete and multidimensional generalizations of the Bellman-Siegert-Krein
resolvent identity, which has been applied to the continuous, one-dimensional stationary smoothing
problem by Kailath. The new equations relate the linear least-squares prediction filters associated
with discrete random fields to the smoothing filters for those fields. This results in new fast
algorithms for deriving the latter from the former. In particular, used in conjunction with recently
developed generalized one (two) dimensional split Levinson and Schur algorithms for covariances
with (block) Toeplitz-plus-Hankel structure, these algorithms can be used to compute smoothing
filters for random fields defined on a polar raster, using fewer computations than those required by
previous algorithms.
I  INTRODUCTION


In tomographic imaging problems solved by filtered back-projection [1], and in spotlight synthetic
aperture radar [2], data are acquired on a polar raster of points, rather than on a rectangular lattice.
Although it is possible to interpolate from the polar raster to a rectangular lattice, it is clearly
preferable to deal with the data as they are. This is particularly true if the data are noisy, and smoothing
is required.

Regarding the data as a random field with a known covariance function, linear least-squares smoothing
may be performed. Computation of the smoothing filter requires solution of two-dimensional discrete
normal equations in polar coordinates. Fast algorithms for solving these equations are desirable
when the covariance has some structure. However, properties such as stationarity are not manifested
as block-Toeplitz structure when the random field is defined on a polar raster. For example, the
covariance of an isotropic random field on a rectangular lattice is a Toeplitz function of the abscissae
and ordinates, while on a polar raster it is a Toeplitz-plus-Hankel function of the radii.

Kailath [3] has noted the applicability of the Bellman-Siegert-Krein (BSK) resolvent identity to
smoothing problems for continuous one-dimensional stationary random processes. First, the prediction
filter for the process is computed, using the continuous-time Krein-Levinson equations, or by direct
solution of the Wiener-Hopf integral equation. Then the BSK identity is used to compute the smoothing
filter, which is the Fredholm resolvent to the integral operator associated with the covariance function.
This approach has been extended to continuous-time close-to-Toeplitz covariances [4] and
continuous-parameter isotropic random fields [5], although the latter uses a Fourier expansion into
one-dimensional processes.

In this paper we generalize Kailath's approach in three ways: (1) from continuous time to discrete
time, resulting in an algorithm directly applicable to real discrete data; (2) from one dimension to two
dimensions, without requiring an assumption of isotropy or an initial Fourier expansion; and (3) from
stationary to non-stationary random fields.


Although the new algorithms of this paper do NOT require the covariance function to have special
structure, they are most useful when used in conjunction with fast algorithms for computing the
prediction filters that DO require and exploit special structure in the covariance function. These
include the Levinson algorithm [6] for stationary one-dimensional random processes, the algorithm of
[7] for non-stationary one-dimensional random processes with Toeplitz-plus-Hankel covariances, and
the algorithm of [8] for two-dimensional random fields on a polar raster with Toeplitz-plus-Hankel
structure in the radial and angular variables of the covariance.

The paper is organized as follows. Section II derives the algorithm for computing the smoothing
filters from the prediction filters for one-dimensional random processes. Section III derives the
corresponding algorithm for two-dimensional random fields on a polar raster. Section IV discusses
computational complexity, and compares the proposed algorithms to other algorithms for computing
the smoothing filters. We also note how the discrete-time equations of this paper reduce to the
continuous-time equations of [3] and [9]. Section V concludes with a summary.

II  DERIVATION  OF  THE  1-D  SMOOTHING  FILTER 

A.  The  Basic  Problem 


The smoothing problem considered in this section is as follows. Given noisy observations $\{y_k, -M \le k \le M\}$
of a zero-mean real-valued random process $\{x_k\}$, compute the linear least-squares estimate of $x_k$ for
each $k$ using all of the observations. The observations are related to the process by $y_k = x_k + n_k$,
where $\{n_k\}$ is zero-mean discrete white noise with unit power uncorrelated with $\{x_k\}$ (white noise
with arbitrary power can easily be handled by scaling). The covariance function $k_{i,j} = E[x_i x_j]$ of
$\{x_k\}$ is known, and is assumed to be positive semi-definite.

The linear least-squares estimate $\hat{x}_i$ of $x_i$ based on $\{y_k, -M \le k \le M\}$ can be expressed as

$$\hat{x}_i = \sum_{j=-M}^{M} g^{M}_{i,j}\, y_j \qquad (1)$$

where the superscript $M$ for $g^{M}_{i,j}$ denotes that the range of the data is from $y_{-M}$ to $y_{M}$. Using the
orthogonality principle of linear least-squares estimation, the smoothing filters $g^{M}_{i,j}$ can be computed
by solving the discrete normal equations

$$k_{i,j} = g^{M}_{i,j} + \sum_{n=-M}^{M} g^{M}_{i,n}\, k_{n,j} \quad \text{for } -M \le i,j \le M \qquad (2)$$

In the special case when $i = M + 1$, equation (2) becomes the discrete Wiener-Hopf equation

$$k_{M+1,j} = h_{M+1,j} + \sum_{n=-M}^{M} h_{M+1,n}\, k_{n,j} = g^{M}_{M+1,j} + \sum_{n=-M}^{M} g^{M}_{M+1,n}\, k_{n,j} \quad \text{for } -M \le j \le M \qquad (3)$$

where $h_{M+1,j} = g^{M}_{M+1,j}$ is the prediction filter. The $h_{i,j}$ are assumed to have been already computed,
presumably using some fast algorithm such as those of [6], [7], or [8]. Our objective is to derive a
recursive formula for computing the smoothing filters $g^{M}_{i,j}$ from the previously computed prediction
filters $h_{i,j}$.


B.  Derivation  of  the  Algorithm 


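Writing (2) with $M$ replaced by $M + 1$ and subtracting (2) gives

$$0 = (g^{M+1}_{i,j} - g^{M}_{i,j}) + \sum_{n=-M}^{M} (g^{M+1}_{i,n} - g^{M}_{i,n})\, k_{n,j} + g^{M+1}_{i,M+1}\, k_{M+1,j} + g^{M+1}_{i,-(M+1)}\, k_{-(M+1),j} \qquad (4)$$

Inserting (3), together with its counterpart for $i = -(M+1)$, in (4) results in

$$\begin{aligned}
0 = {} & (g^{M+1}_{i,j} - g^{M}_{i,j}) + \sum_{n=-M}^{M} (g^{M+1}_{i,n} - g^{M}_{i,n})\, k_{n,j} \\
& + g^{M+1}_{i,M+1} \Big[ h_{M+1,j} + \sum_{n=-M}^{M} h_{M+1,n}\, k_{n,j} \Big]
  + g^{M+1}_{i,-(M+1)} \Big[ h_{-(M+1),j} + \sum_{n=-M}^{M} h_{-(M+1),n}\, k_{n,j} \Big]
\end{aligned} \qquad (5)$$

and reordering (5) gives

$$\begin{aligned}
(g^{M+1}_{i,j} - g^{M}_{i,j}) + \sum_{n=-M}^{M} (g^{M+1}_{i,n} - g^{M}_{i,n})\, k_{n,j}
= {} & -\big[ g^{M+1}_{i,M+1} h_{M+1,j} + g^{M+1}_{i,-(M+1)} h_{-(M+1),j} \big] \\
& - \sum_{n=-M}^{M} \big[ g^{M+1}_{i,M+1} h_{M+1,n} + g^{M+1}_{i,-(M+1)} h_{-(M+1),n} \big] k_{n,j}
\end{aligned} \qquad (6)$$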

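Since the covariance function $k_{i,j}$ is positive semi-definite by assumption, $\delta_{i,j} + k_{i,j}$ is positive
definite, and the solution to any system of equations with system matrix consisting of $\delta_{i,j} + k_{i,j}$ must
be unique. Therefore, we have

$$g^{M+1}_{i,j} - g^{M}_{i,j} = -\big[ g^{M+1}_{i,M+1} h_{M+1,j} + g^{M+1}_{i,-(M+1)} h_{-(M+1),j} \big] \quad \text{for all } -M \le i,j \le M \qquad (7)$$

Equation (7) allows $g^{M+1}_{i,j}$ to be computed recursively from $g^{M}_{i,j}$ and the prediction filters $h_{\pm(M+1),j}$.

Note that $g^{M+1}_{i,j}$ and $h_{\pm(M+1),j}$ may be computed in parallel.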
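
The recursion (7) can be checked numerically against direct solutions of the normal equations (2). The sketch below is illustrative only: the random covariance, the helper name `smoother`, and the storage convention (index $i$ stored at array position `c + i`) are assumptions, and the prediction filters are read off as $h_{\pm(M+1),n} = g^{M}_{\pm(M+1),n}$ rather than produced by a fast algorithm.

```python
import numpy as np


def smoother(K, c, M):
    """Solve (2) directly: row c + i, column M + n of the result holds g^M_{i,n}."""
    S = np.arange(c - M, c + M + 1)                 # data window n = -M..M
    A = np.eye(S.size) + K[np.ix_(S, S)]            # delta_{n,j} + k_{n,j}
    return np.linalg.solve(A, K[S, :]).T            # K[:, S] A^{-1}, since A and K are symmetric


rng = np.random.default_rng(0)
P, M = 4, 2                                          # indices run over -P..P
c = P
X = rng.standard_normal((2 * P + 1, 2 * P + 1))
K = X @ X.T / (2 * P + 1)                            # an arbitrary positive semi-definite covariance

G_M = smoother(K, c, M)
G_M1 = smoother(K, c, M + 1)
h_plus, h_minus = G_M[c + M + 1], G_M[c - M - 1]     # h_{M+1,n} and h_{-(M+1),n}

lhs = G_M1[:, 1:-1] - G_M                            # g^{M+1}_{i,j} - g^M_{i,j}, -M <= j <= M
rhs = -(np.outer(G_M1[:, -1], h_plus) + np.outer(G_M1[:, 0], h_minus))
max_err = np.max(np.abs(lhs - rhs))
print(max_err)
```

In this check no Toeplitz or Hankel structure is imposed on `K`, in line with the statement that the identity itself does not rely on special structure.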

C.  Computation  of  Boundary  Points 

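In order to use (7), the boundary points $g^{M+1}_{i,\pm(M+1)}$ must be computed first. This can be done as
follows. Setting $j = \pm(M + 1)$ in (2), we have

$$\begin{aligned}
& \Big[ 1 + k_{(M+1),(M+1)} - \sum_{n=-M}^{M} h_{(M+1),n}\, k_{n,(M+1)} \Big] g^{M+1}_{i,(M+1)} \\
& + \Big[ k_{-(M+1),(M+1)} - \sum_{n=-M}^{M} h_{-(M+1),n}\, k_{n,(M+1)} \Big] g^{M+1}_{i,-(M+1)}
  = k_{i,(M+1)} - \sum_{n=-M}^{M} g^{M}_{i,n}\, k_{n,(M+1)}
\end{aligned} \qquad (8)$$

$$\begin{aligned}
& \Big[ k_{(M+1),-(M+1)} - \sum_{n=-M}^{M} h_{(M+1),n}\, k_{n,-(M+1)} \Big] g^{M+1}_{i,(M+1)} \\
& + \Big[ 1 + k_{-(M+1),-(M+1)} - \sum_{n=-M}^{M} h_{-(M+1),n}\, k_{n,-(M+1)} \Big] g^{M+1}_{i,-(M+1)}
  = k_{i,-(M+1)} - \sum_{n=-M}^{M} g^{M}_{i,n}\, k_{n,-(M+1)}
\end{aligned} \qquad (9)$$

These equations can be written as a $2 \times 2$ matrix equation for each of the unknown $g^{M+1}_{i,\pm(M+1)}$:

$$\begin{bmatrix}
1 + k_{(M+1),(M+1)} - \sum_{n=-M}^{M} h_{(M+1),n} k_{n,(M+1)} & k_{-(M+1),(M+1)} - \sum_{n=-M}^{M} h_{-(M+1),n} k_{n,(M+1)} \\
k_{(M+1),-(M+1)} - \sum_{n=-M}^{M} h_{(M+1),n} k_{n,-(M+1)} & 1 + k_{-(M+1),-(M+1)} - \sum_{n=-M}^{M} h_{-(M+1),n} k_{n,-(M+1)}
\end{bmatrix}
\begin{bmatrix} g^{M+1}_{i,(M+1)} \\ g^{M+1}_{i,-(M+1)} \end{bmatrix}
=
\begin{bmatrix}
k_{i,(M+1)} - \sum_{n=-M}^{M} g^{M}_{i,n} k_{n,(M+1)} \\
k_{i,-(M+1)} - \sum_{n=-M}^{M} g^{M}_{i,n} k_{n,-(M+1)}
\end{bmatrix},
\quad -(M+1) \le i \le (M+1) \qquad (10)$$

where we have used the identities $h_{(M+1),n} = g^{M}_{(M+1),n}$ and $h_{-(M+1),n} = g^{M}_{-(M+1),n}$. Note that the
system matrix in (10) is independent of $i$.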


D.  Summary  of  1-D  Algorithm 
Given the data $\{y_k\}$ in the interval $[-L, L]$, the entire algorithm for computing the smoothing
filters may be summarized as follows:

1.  Initialize using $g^{|i|-1}_{i,j} = h_{i,j}$ for all $-(|i| - 1) \le j \le |i| - 1$.

2.  Given $g^{M}_{i,j}$, $-M \le i,j \le M$, update to $g^{M+1}_{i,j}$ as follows:

(a)  Compute the boundary points $g^{M+1}_{i,M+1}$ and $g^{M+1}_{i,-(M+1)}$ by solving the $2 \times 2$ system (10).

(b)  For each $i$ and $j$, $-M \le i,j \le M$, compute $g^{M+1}_{i,j}$ from $g^{M}_{i,j}$ using (7). If $k_{i,j}$ has special
structure, compute $h_{i,j}$ in parallel using a fast algorithm (e.g., those of [6] or [7]).

(c)  Continue for $M = |i| - 1$ to $L$.
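
The steps above can be sketched in a few lines of Python. This is a simplified variant under stated assumptions, not the authors' implementation: the array layout and the names `bsk_step` and `bsk_smoother` are hypothetical, the recursion is started from a single observation ($M = 0$) for every $i$ at once, and the prediction filters are taken as the rows $g^{M}_{\pm(M+1),n}$ of the current filter array instead of being supplied by the algorithms of [6] or [7].

```python
import numpy as np


def bsk_step(K, G, c, M):
    """Grow the data window from [-M, M] to [-(M+1), M+1].

    K: (2P+1, 2P+1) covariance, index i stored at position c + i.
    G: (2P+1, 2M+1) array whose row c + i, column M + n holds g^M_{i,n}.
    """
    S = np.arange(c - M, c + M + 1)                  # interior indices n = -M..M
    b = np.array([c + M + 1, c - M - 1])             # boundary indices +(M+1), -(M+1)
    H = G[b, :]                                      # h_{+-(M+1),n} = g^M_{+-(M+1),n}
    K_Sb = K[np.ix_(S, b)]
    system = np.eye(2) + K[np.ix_(b, b)] - H @ K_Sb  # transpose of the matrix in (10); same for all i
    rhs = K[:, b] - G @ K_Sb                         # right-hand sides of (10), one row per i
    G_b = np.linalg.solve(system.T, rhs.T).T         # boundary points g^{M+1}_{i,+-(M+1)}
    G_in = G - G_b @ H                               # interior update, equation (7)
    return np.column_stack([G_b[:, 1], G_in, G_b[:, 0]])


def bsk_smoother(K):
    """Return the array of g^P_{i,j} for data on [-P, P], where K is (2P+1) x (2P+1)."""
    c = (K.shape[0] - 1) // 2
    G = K[:, [c]] / (1.0 + K[c, c])                  # M = 0: only y_0 is observed
    for M in range(c):
        G = bsk_step(K, G, c, M)
    return G


rng = np.random.default_rng(0)
P = 4
X = rng.standard_normal((2 * P + 1, 2 * P + 1))
K = X @ X.T / (2 * P + 1)                            # arbitrary positive semi-definite covariance
G = bsk_smoother(K)
direct = np.linalg.solve(np.eye(2 * P + 1) + K, K)   # solution of (2) over the full window
```

Because the $2 \times 2$ system matrix is formed once per step and reused for every $i$, each step costs one small factorization plus matrix products, which is the saving the text attributes to (10).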

III  DERIVATION  OF  THE  2-D  SMOOTHING  FILTER  ON  A  POLAR  RASTER

A.  The  Basic  Problem 

Now we consider the smoothing problem for a two-dimensional random field defined on a polar
raster, whose points lie along radial lines in $2N$ angular directions (see Fig. 1). The problem considered
is as follows. Given noisy observations $\{y_{i,k}, 0 \le i \le M, 1 \le k \le 2N\}$ of a zero-mean real-valued
discrete random field $\{x_{i,k}\}$ at the points $(i,k)$ of a polar raster on a disk, compute the linear
least-squares estimate of $x_{i,k}$ for each $(i,k)$ using all of the observations. Here the first subscript denotes
radial distance from the origin and the second subscript denotes angular position ($k$ corresponds to
the angle $2\pi k/2N$).

The observations are related to the random field by $y_{i,k} = x_{i,k} + n_{i,k}$, where $\{n_{i,k}\}$ is
a zero-mean two-dimensional discrete white noise with unit power uncorrelated with $\{x_{i,k}\}$ (white noise
with arbitrary power can easily be handled by scaling). The covariance function $k_{i,N_1;j,N_2} = E[x_{i,N_1} x_{j,N_2}]$
of $\{x_{i,k}\}$ is known, and is assumed to be positive semi-definite.

From Fig. 1, it is clear that the point $(i,k) = (-i, k \pm N)$; in the sequel the point $(i,k)$, $N + 1 \le k \le 2N$,
will be denoted by $(-i, k - N)$. The linear least-squares estimate $\hat{x}_{i,N_1}$ of $x_{i,N_1}$, based on
$\{y_{j,k}, 0 \le j \le M, 1 \le k \le 2N\} = \{y_{j,k}, -M \le j \le M, 1 \le k \le N\}$, can be expressed as

$$\hat{x}_{i,N_1} = \sum_{j=-M}^{M} \sum_{N_2=1}^{N} g^{M}_{i,N_1;j,N_2}\, y_{j,N_2} \qquad (11)$$

where the smoothing filters $g^{M}_{i,N_1;j,N_2}$ satisfy the two-dimensional discrete normal equations

$$k_{i,N_1;j,N_2} = g^{M}_{i,N_1;j,N_2} + \sum_{n=-M}^{M} \sum_{N_3=1}^{N} g^{M}_{i,N_1;n,N_3}\, k_{n,N_3;j,N_2}, \quad -M \le i,j \le M, \; 1 \le N_1, N_2 \le N \qquad (12)$$

A  radial  weighting  n  can  be  introduced  into  the  double  sums  in  (11)  and  (12)  by  replacing  A:,,Ari;j,7V2 
and  9^!\f^  .j  njj  with  '/^9^Ni-3  N2  '  allows  the  algorithm  to  be  applied  to  a  dis¬ 

cretized  two-dimensional  integral  equation. 

B.  Derivation  of  the  Algorithm 

The derivation is identical to that for the one-dimensional case, since the angular sum is unaffected
by the increase of the radial sum from $M$ to $M + 1$. The result is (compare to (7)):

$$g^{M+1}_{i,N_1;j,N_2} - g^{M}_{i,N_1;j,N_2} = -\Big[ \sum_{N_3=1}^{N} g^{M+1}_{i,N_1;M+1,N_3}\, h_{M+1,N_3;j,N_2} + \sum_{N_3=1}^{N} g^{M+1}_{i,N_1;-(M+1),N_3}\, h_{-(M+1),N_3;j,N_2} \Big] \qquad (13)$$

for all $-M \le i,j \le M$, $1 \le N_1, N_2 \le N$.

Here the $h_{i,N_1;j,N_2} = g^{|i|-1}_{i,N_1;j,N_2}$ are the two-dimensional prediction filters. The $h_{i,N_1;j,N_2}$ could be
computed recursively in parallel with (13), using the fast algorithm of [8].


C.  Computation  of  Boundary  Points 


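As before, we need to compute the boundary points $g^{M+1}_{i,N_1;\pm(M+1),N_2}$ prior to using (13). Setting
$j = \pm(M + 1)$ in (12) results in the equations (compare to (8) and (9))

$$\begin{aligned}
& \Big[ g^{M+1}_{i,N_1;M+1,N_2} + \sum_{N_3=1}^{N} g^{M+1}_{i,N_1;M+1,N_3}\, k_{M+1,N_3;M+1,N_2}
  - \sum_{N_4=1}^{N} g^{M+1}_{i,N_1;M+1,N_4} \Big( \sum_{n=-M}^{M} \sum_{N_3=1}^{N} h_{M+1,N_4;n,N_3}\, k_{n,N_3;M+1,N_2} \Big) \Big] \\
& + \Big[ \sum_{N_3=1}^{N} g^{M+1}_{i,N_1;-(M+1),N_3}\, k_{-(M+1),N_3;M+1,N_2}
  - \sum_{N_4=1}^{N} g^{M+1}_{i,N_1;-(M+1),N_4} \Big( \sum_{n=-M}^{M} \sum_{N_3=1}^{N} h_{-(M+1),N_4;n,N_3}\, k_{n,N_3;M+1,N_2} \Big) \Big] \\
& = k_{i,N_1;M+1,N_2} - \sum_{n=-M}^{M} \sum_{N_3=1}^{N} g^{M}_{i,N_1;n,N_3}\, k_{n,N_3;M+1,N_2}
\end{aligned} \qquad (14)$$

and

$$\begin{aligned}
& \Big[ \sum_{N_3=1}^{N} g^{M+1}_{i,N_1;M+1,N_3}\, k_{M+1,N_3;-(M+1),N_2}
  - \sum_{N_4=1}^{N} g^{M+1}_{i,N_1;M+1,N_4} \Big( \sum_{n=-M}^{M} \sum_{N_3=1}^{N} h_{M+1,N_4;n,N_3}\, k_{n,N_3;-(M+1),N_2} \Big) \Big] \\
& + \Big[ g^{M+1}_{i,N_1;-(M+1),N_2} + \sum_{N_3=1}^{N} g^{M+1}_{i,N_1;-(M+1),N_3}\, k_{-(M+1),N_3;-(M+1),N_2}
  - \sum_{N_4=1}^{N} g^{M+1}_{i,N_1;-(M+1),N_4} \Big( \sum_{n=-M}^{M} \sum_{N_3=1}^{N} h_{-(M+1),N_4;n,N_3}\, k_{n,N_3;-(M+1),N_2} \Big) \Big] \\
& = k_{i,N_1;-(M+1),N_2} - \sum_{n=-M}^{M} \sum_{N_3=1}^{N} g^{M}_{i,N_1;n,N_3}\, k_{n,N_3;-(M+1),N_2}
\end{aligned} \qquad (15)$$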

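If we define the following $N \times N$ matrices ($1 \le N_1, N_2 \le N$), in which the two signs are chosen independently,

$$[G^{\pm}]_{N_1,N_2} = g^{M+1}_{i,N_1;\pm(M+1),N_2} \qquad (16)$$

$$[K^{\pm\pm}]_{N_1,N_2} = k_{\pm(M+1),N_1;\pm(M+1),N_2} \qquad (17)$$

$$[H^{\pm}K^{\pm}]_{N_1,N_2} = \sum_{n=-M}^{M} \sum_{N_3=1}^{N} h_{\pm(M+1),N_1;n,N_3}\, k_{n,N_3;\pm(M+1),N_2} \qquad (18)$$

and then define from equations (16)-(18) the additional $N \times N$ matrices

$$A = I + K^{++} - H^{+}K^{+}, \quad B = K^{-+} - H^{-}K^{+} \qquad (19)$$

$$C = K^{+-} - H^{+}K^{-}, \quad D = I + K^{--} - H^{-}K^{-} \qquad (20)$$

$$[R]_{N_1,N_2} = k_{i,N_1;(M+1),N_2} - \sum_{n=-M}^{M} \sum_{N_3=1}^{N} g^{M}_{i,N_1;n,N_3}\, k_{n,N_3;(M+1),N_2} \qquad (21)$$

$$[S]_{N_1,N_2} = k_{i,N_1;-(M+1),N_2} - \sum_{n=-M}^{M} \sum_{N_3=1}^{N} g^{M}_{i,N_1;n,N_3}\, k_{n,N_3;-(M+1),N_2} \qquad (22)$$

then equations (14) and (15) can be written in matrix form as

$$G^{+}A + G^{-}B = R \qquad (23)$$

$$G^{+}C + G^{-}D = S \qquad (24)$$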
Equations (23) and (24) are a $2N \times 2N$ system of equations for $G^{+}$ and $G^{-}$; compare them with (10)
(for which $N = 1$). However, a further simplification is possible. Since the system matrix

$$\begin{bmatrix} A & B \\ C & D \end{bmatrix}$$

is the same for each $i$, i.e. independent of $i$, equations (23) and (24) can be solved in closed form to give

$$G^{+} = (R - SD^{-1}B)(A - CD^{-1}B)^{-1} \qquad (25)$$

$$G^{-} = (S - RA^{-1}C)(D - BA^{-1}C)^{-1} \qquad (26)$$

Hence computation of the boundary points $g^{M+1}_{i,N_1;\pm(M+1),N_2}$ requires only the inversion of
four $N \times N$ matrices in (25) and (26). This is significant, since the smoothing filters $g^{M}_{i,N_1;j,N_2}$ must
generally be computed for all $i$ and $N_1, N_2$ (we generally wish to smooth all or most of an image, not
just one pixel). This is where our algorithm saves a significant amount of computation, as compared
with other algorithms (see below).
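
The closed forms (25) and (26) can be checked by substituting them back into (23) and (24). The sketch below does this for small random blocks; the function name `boundary_blocks` and the test matrices are assumptions made for illustration, and linear solves are used in place of explicit inverses.

```python
import numpy as np


def boundary_blocks(A, B, C, D, R, S):
    """Return (G_plus, G_minus) satisfying G+ A + G- B = R and G+ C + G- D = S."""
    Dinv_B = np.linalg.solve(D, B)                   # D^{-1} B
    Ainv_C = np.linalg.solve(A, C)                   # A^{-1} C
    # X = Y Z^{-1} is computed as the transpose of solve(Z^T, Y^T).
    G_plus = np.linalg.solve((A - C @ Dinv_B).T, (R - S @ Dinv_B).T).T    # (25)
    G_minus = np.linalg.solve((D - B @ Ainv_C).T, (S - R @ Ainv_C).T).T   # (26)
    return G_plus, G_minus


rng = np.random.default_rng(1)
N = 3
A, D = (np.eye(N) + 0.1 * rng.standard_normal((N, N)) for _ in range(2))
B, C, R, S = (0.1 * rng.standard_normal((N, N)) for _ in range(4))
G_plus, G_minus = boundary_blocks(A, B, C, D, R, S)
```

Only $A$, $D$ and the two Schur complements depend on the step $M$; the right-hand sides $R$ and $S$ are the only quantities that change with $i$.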

D.  Summary  of  2-D  Algorithm 

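Given the data $\{y_{i,k}, -L \le i \le L, 1 \le k \le N\}$, the entire algorithm for computing the
two-dimensional smoothing filters may be summarized as follows:

1.  Initialize using $g^{|i|-1}_{i,N_1;j,N_2} = h_{i,N_1;j,N_2}$ for all $-(|i| - 1) \le j \le |i| - 1$ and $1 \le N_1, N_2 \le N$.

2.  Given $g^{M}_{i,N_1;j,N_2}$, $-M \le i,j \le M$, $1 \le N_1, N_2 \le N$, update to $g^{M+1}_{i,N_1;j,N_2}$ as follows:

(a)  Compute the boundary points $g^{M+1}_{i,N_1;M+1,N_2}$ and $g^{M+1}_{i,N_1;-(M+1),N_2}$ by solving in parallel the
$2M + 1$ systems of size $2N \times 2N$ given by (25) and (26).

(b)  For each $i$ and $j$, $-M \le i,j \le M$, and each $N_1$ and $N_2$, $1 \le N_1, N_2 \le N$, compute
$g^{M+1}_{i,N_1;j,N_2}$ from $g^{M}_{i,N_1;j,N_2}$ using (13). If $k_{i,N_1;j,N_2}$ has special structure, compute $h_{i,N_1;j,N_2}$ in parallel
using a fast algorithm (e.g. the algorithm of [8]).

(c)  Continue for $M = |i| - 1$ to $L$.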


IV  COMPUTATIONAL  COMPLEXITY 


We  determine  the  number  of  Multiplications-And-Divisions  (MADs)  needed  to  compute  the  smoothing 
filters  from  the  prediction  filters.  We  also  determine  the  total  number  of  MADs  needed  to  compute  the 
smoothing  filters  from  the  covariance  function,  assuming  that  the  latter  has  special  structure  and  a 
fast  algorithm  has  been  used  to  compute  the  prediction  filters.  Although  some  current  DSP  chips  can 
perform  multiplications  as  quickly  as  additions,  the  fact  remains  that  multiplication  is  a  more  complex 
operation  than  addition.  MADs  can  still  be  used  as  a  rough  guide  to  the  computational  complexity 
of  an  algorithm. 

A.  Computational  Complexity  of  the  One- Dimensional  Algorithm 

The number of MADs needed to compute the smoothing filters from the prediction filters, given
data $\{y_j, -L \le j \le L\}$, can be determined as follows. For each $i$, updating the smoothing filters from
$g^{M}_{i,j}$ to $g^{M+1}_{i,j}$ (this corresponds to adding two data points at $j = M + 1$ and $j = -(M + 1)$) requires
$6(2M + 1) + 8$ MADs to compute the boundary points $g^{M+1}_{i,\pm(M+1)}$ (six sum-of-products computations
in (10)), and $2(2M + 1)$ MADs to update the other $g^{M}_{i,j}$ to $g^{M+1}_{i,j}$ in (7). The total number of MADs
to compute $g^{L}_{i,j}$ for one $i$ and all $j$ is thus $\sum_{M=|i|-1}^{L} [8(2M + 1) + 8] = 8(L^2 - i^2) + 24L + 8|i| + 16$.
However, the total number of MADs needed to compute $g^{L}_{i,j}$ for all $i$ and $j$ is only
$\sum_{M=1}^{L} [4(2M + 1) + 2] + \sum_{i=-L}^{L} \sum_{M=|i|-1}^{L} [4(2M + 1) + 6] = 5\tfrac{1}{3}L^3 + 34L^2 + 48\tfrac{2}{3}L + 12$,
since the system matrix in (10) is
independent of $i$, and thus need not be re-computed and re-inverted for each $i$.
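
The two operation counts can be reproduced by evaluating the sums term by term. The following sketch (function names are hypothetical) tallies the per-step costs quoted above and compares them with the closed forms; the second closed form is multiplied by 3 so that the comparison stays in integer arithmetic.

```python
def mads_single_point(L, i):
    """MADs to obtain g^L_{i,j} for one i and all j: sum of 8(2M+1)+8 over M = |i|-1..L."""
    return sum(8 * (2 * M + 1) + 8 for M in range(abs(i) - 1, L + 1))


def mads_all_points(L):
    """MADs for all i and j when the system matrix of (10) is formed once per step."""
    shared = sum(4 * (2 * M + 1) + 2 for M in range(1, L + 1))
    per_point = sum(4 * (2 * M + 1) + 6
                    for i in range(-L, L + 1)
                    for M in range(abs(i) - 1, L + 1))
    return shared + per_point


def closed_form_single(L, i):
    return 8 * (L**2 - i**2) + 24 * L + 8 * abs(i) + 16


def closed_form_all_times_3(L):
    # 3 * (5 1/3 L^3 + 34 L^2 + 48 2/3 L + 12)
    return 16 * L**3 + 102 * L**2 + 146 * L + 36
```

For large $L$ the all-points count is dominated by $5\tfrac{1}{3}L^3$, compared with roughly $(2L + 1) \cdot 8L^2$ if every $i$ were processed independently.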

In the sequel, we assume (for purposes of comparison) that $L \gg 1$ and $i \gg 1$. Then the dominant
terms in the number of MADs are the terms of highest order in $L$ and $i$. To facilitate comparisons,
only these dominant terms will be given.

If the covariance $k_{i,j}$ is Toeplitz, i.e. $\{x_k\}$ is a stationary process, then we have $k_{i,j} = k_{i-j}$
and $g^{M}_{i,j} = g^{M}_{-i,-j}$ from (2). Then two of the four sum-of-product computations in the system matrix of
(10) are redundant, so that computation of $g^{L}_{i,j}$ for one $i$ and all $j$ requires only $6(L^2 - i^2)$ MADs. Also,
since $g^{L}_{i,j}$ need only be computed for $i \ge 0$, computation of $g^{L}_{i,j}$ for all $i$ and $j$ requires only half as many
MADs as before, viz. $2\tfrac{2}{3}L^3$. Furthermore, the Levinson algorithm (L) [6] may be used to compute the
prediction filters $\{h_{i,j}, -L \le i,j \le L\}$ from $k_{i,j}$, at a cost of $4L^2$ MADs. The Levinson algorithm can
be propagated in parallel with our algorithm, resulting in an overall fast algorithm for computing the
smoothing filters $g^{L}_{i,j}$ from $k_{i,j}$. If the covariance is Toeplitz-plus-Hankel, the fast algorithm of [7] may
be used to compute the $\{h_{i,j}, -L \le i,j \le L\}$ from the $k_{i,j}$, at a cost of $24L^2$ MADs, again in parallel
with our algorithm. However, we no longer have $g^{M}_{i,j} = g^{M}_{-i,-j}$, so the reductions in computation for
purely Toeplitz covariances no longer apply.

The major alternatives to these procedures are the Levinson-Trench-Zohar (LTZ) [10] algorithm
for Toeplitz systems, and the algorithm of Merchant and Parks (MP) [11] for Toeplitz-plus-Hankel
systems. We compare the numbers of MADs required by all of these algorithms in Table 1.

| Covariance | Filter for | LTZ or MP | L+BSK or [7]+BSK |
|---|---|---|---|
| Symmetric Toeplitz | single point $i$ | $8L^2$ | $10L^2 - 6i^2$ |
| | all $-L \le i \le L$ | $4L^3$ | $2\tfrac{2}{3}L^3$ |
| Toeplitz-plus-Hankel | single point $i$ | | $32L^2 - 8i^2$ |
| | all $-L \le i \le L$ | $64L^3$ | |

Table 1: Numbers of MADs required for some specific covariance functions to solve (2)

| Covariance | Filter for | LWR or MP | LWR+BSK or [8]+BSK |
|---|---|---|---|
| Block Toeplitz | single point $i$ | $10L^2N^3$ | $(14L^2 - 8i^2)N^3$ |
| | all $-L \le i \le L$ | $8L^3N^3$ | $5\tfrac{1}{3}L^3N^3$ |
| Block-Toeplitz-plus-Hankel | single point $i$ | $64L^2N^3$ | $8(L^2 - i^2)N^3$ |
| | all $-L \le i \le L$ | | $5\tfrac{1}{3}L^3N^3$ |

Table 2: Numbers of MADs required for some specific covariance functions to solve (12)
For a Toeplitz covariance, it can be seen from Table 1 that if $g^{L}_{i,j}$ for a single point $i$ is desired,
i.e., we wish to compute a smoothed estimate at only one point, then the LTZ algorithm is superior
to ours for small values of $i$, while ours is superior for large values of $i$. However, if $g^{L}_{i,j}$ for all points
$-L \le i \le L$ is desired, i.e., we wish to compute smoothed estimates at all points (as would generally
be the case), then our algorithm in conjunction with the Levinson algorithm requires only $\tfrac{2}{3}$ as many
MADs for large $L$. Furthermore, for Toeplitz-plus-Hankel covariances, our algorithm in conjunction
with that of [7] requires less than half as many MADs to compute $g^{L}_{i,j}$ for a single point $i$, and
only a fraction as many MADs to compute $g^{L}_{i,j}$ for all $i$ when $L$ is large. Further savings are possible since many
computations (e.g., the updates and the sum in (10)) can be done in parallel.

Other approaches may require still more computation. $g^{M}_{i,j}$ may be updated to $g^{M+1}_{i,j}$ using the
well-known formula for updating the inverse of a partitioned matrix. However, this requires $3M^2$
MADs per update, as opposed to the $8(2M + 1) + 8$ MADs required by the BSK identity. Direct

solution  of  (2)  using  Gaussian  elimination  would  require  ^{2L  +  1)^  +  ^(2L  +  1)*  MADs  for  each  i. 

B.  Computational  Complexity  of  the  Two-Dimensional  Algorithm 

We now assume that the observations are $\{y_{j,k}, -L \le j \le L, 1 \le k \le N\}$, so that updating the
smoothing filters from $g^{M}_{i,N_1;j,N_2}$ to $g^{M+1}_{i,N_1;j,N_2}$ corresponds to adding a "shell" of $2N$ data points at radius
$M + 1$. For each $i$, computation of the boundary points $g^{M+1}_{i,N_1;\pm(M+1),N_2}$ requires $6(2M + 1)N^3$ MADs
for (16)-(22), and 4 $N \times N$ matrix multiplications and inversions for (25) and (26). Updating the other
smoothing filters from $g^{M}_{i,N_1;j,N_2}$ to $g^{M+1}_{i,N_1;j,N_2}$ requires $2(2M + 1)N^3$ MADs for (13). Hence the number
of MADs needed to compute $g^{L}_{i,N_1;j,N_2}$ from the prediction filters for one $i$ and all $j$ is $8(L^2 - i^2)N^3$,
while the number of MADs needed for all $i$ and $j$ is $5\tfrac{1}{3}L^3N^3$. Note that these are the numbers for the
one-dimensional algorithm multiplied by $N^3$, since all operations now involve matrices.

In the sequel, we assume (for purpose of comparison) that $L \gg N \gg 1$. If the covariance
is block-Toeplitz, i.e. Toeplitz in $i$ and $j$, then the Levinson-Wiggins-Robinson (LWR) [12]


algorithm  may  be  used  to  compute  the  prediction  filters  at  a  cost  of 

MADs (recall that the backward predictors are no longer the time-reversed forward predictors in the
multichannel case). If the covariance is Toeplitz-plus-Hankel in both $i$ and $j$ and $N_1$ and $N_2$, as it is
for an isotropic random field on a polar raster, the fast algorithm of [8] may be used to compute the
prediction filters, at a cost of $24L^2N^2$ MADs.

The major alternatives to these procedures are the LWR algorithm adapted to an arbitrary
block-Toeplitz system, and a matrix generalization of the Merchant-Parks procedure for block
Toeplitz-plus-Hankel systems. Results are summarized in Table 2. The savings are similar to those for the
one-dimensional algorithms, except for the even greater savings for block Toeplitz-plus-Hankel covariances.
The reason for the great savings here is the efficiency of the algorithm of [8], which requires only $24L^2N^2$
MADs to determine the prediction filters from the covariance function; that is negligible compared to
$8(L^2 - i^2)N^3$ and $5\tfrac{1}{3}L^3N^3$ if $L \gg N \gg 1$.

C.  Relation  to  Continuous- Parameter  BSK  Identities 

It is instructive to examine the continuous-parameter limits of the various equations of this paper.
Let the intervals between points be $\delta r$ in the radial direction and $\delta\theta = \pi/N$ radians in the angular
direction. Introducing a radial weighting factor, as discussed below (12), and taking limits as $\delta r$ and
$\delta\theta$ go to zero results in the following transformations:

1.  The  discrete  normal  equations  (2)  and  (12)  become  Fredholm  integral  equations.  Similarly,  the 
discrete  Wiener-Hopf  equation  (3)  and  its  two-dimensional  counterpart  become  Wiener-Hopf 
integral  equations; 

2.  The smoothing filters become the Fredholm resolvents to the integral operators associated with
the covariance functions;

3.  Using $g_{x,M+1} = g_{M+1,x} = h_{M+1,x}$, equation (7) becomes

$$\frac{\partial g}{\partial T}(x,y;T) = -\{ g(x,T;T)\,h(T,y) + g(x,-T;T)\,h(-T,y) \} = -\{ h(T,x)\,h(T,y) + h(-T,x)\,h(-T,y) \} \qquad (27)$$

where $g(x,y;T)$ is the smoothing function by which an observation at $y$ in the interval $[-T,T]$ is
multiplied and integrated to compute an estimate at $x$. Equation (27) is the BSK resolvent identity
(modified from $[0,T]$ to $[-T,T]$), which was applied to continuous-time smoothing problems
in [3];

4.  Similarly,  the  recursion  (13)  becomes 

||;(|x|e,,|y|e,;T)  =  -  j^5(|xle,,Te';T)h(Tc',|yK)r2de'  (28) 

where $e_x$, $e_y$ and $e'$ are unit vectors, $x = |x|e_x$, $y = |y|e_y$ and $S$ is the unit circle. Equation (28) is
identical to the generalized BSK identity applied to a multi-dimensional continuous-parameter
smoothing problem in [9];

5.  Since $\delta_{i,j}$ becomes a continuous-time impulse, the units in (10) and (19)-(20) dominate the other
terms. Hence the computations of the boundary points (10) and (25)-(26) become, respectively,

$$g(x,y;T) = k(x,y) - \int_0^T g(x,z;T)\,k(z,y)\,dz \qquad (29)$$

$$g(|x|e_x, |y|e_y; T) = k(|x|e_x, |y|e_y) - \int_0^T \int_S g(|x|e_x, |z|e_z; T)\,k(|z|e_z, |y|e_y)\,|z|\,de_z\,d|z| \qquad (30)$$

which agree with equations for computing boundary values that appear in [3]-[5] and [9].

Note that although the discrete equations transform into the expected continuous equations, the
forms of the discrete equations are not obvious from the continuous equations.

V  CONCLUSION 


New fast algorithms for computing the linear least-squares smoothing filters for random processes
and fields have been derived. These algorithms relate the smoothing filters to the prediction filters
associated with the same covariance. If the covariance has special structure, fast algorithms such
as those of [6], [7], and [8] may be used to compute the prediction filters; such algorithms may be
propagated in parallel with those of this paper. This can result in significant computational savings.
However, it is important to emphasize that the results of this paper hold for arbitrary covariances,
and do not rely on the existence of such fast algorithms.

In  the  limit  of  continuous  time,  the  one-dimensional  algorithm  reduces  to  the  BSK  identity,  which 
was  applied  previously  to  smoothing  problems  for  continuous-time  stationary  random  processes  by 
Kailath.  However,  the  algorithms  are  non-trivial  discrete  and  two-dimensional  generalizations  of  the 
BSK  identity.  Since  both  data  and  numerical  computation  are  inherently  discrete  in  nature,  these 
algorithms  constitute  a  significant  step  in  the  practical  application  of  these  smoothing  ideas. 
FIGURE  HEADING


1.  The polar raster on which the two-dimensional random field is defined, with $2N = 8$.

Figure 1: 2-D polar raster with 0 <= radius <= 5; 1 <= angular part <= 8


APPENDIX  F

W.-H.  Fang  and  A.E.  Yagle,  “Two  Methods  for  Toeplitz-plus-Hankel  Approximation 
to  a  Data  Covariance  Matrix,”  to  appear  in  IEEE  Trans.  Signal  Processing,  vol.  ASSP-40, 

no.  6,  June  1992. 



Two  Methods  for  Toeplitz-plus-Hankel  Approximation  to  a  Data 

Covariance  Matrix 

Wen-Hsien  Fang  and  Andrew  E.  Yagle 
Dept. of Electrical Engineering and Computer Science
The  University  of  Michigan 
Ann  Arbor,  Michigan  48109-2122 

Revised  March  1991 


Abstract 

Recently, fast algorithms have been developed for computing the optimal linear least-squares
prediction filters for non-stationary random processes (fields) whose covariances have (block)
Toeplitz-plus-Hankel form. If the covariance of the random process (field) must be estimated
from the data itself, we have the following problem: Given a data covariance matrix, computed
from the available data, find the Toeplitz-plus-Hankel matrix closest to this matrix in some
sense. This paper gives two procedures for computing the Toeplitz-plus-Hankel matrix that
minimizes the Hilbert-Schmidt norm of the difference between the two matrices. The first
approach projects the data covariance matrix onto the subspace of Toeplitz-plus-Hankel matrices,
for which basis functions can be computed using a Gram-Schmidt orthonormalization. The
second approach projects onto the subspace of symmetric Toeplitz plus skew-persymmetric Hankel
matrices, resulting in a much simpler algorithm. The extension to block Toeplitz-plus-Hankel
data covariance matrix approximation is also addressed.
I  INTRODUCTION


Some fast algorithms have recently been developed for computing the optimal linear
least-squares prediction filters for non-stationary random processes (fields) whose covariances have (block)
Toeplitz-plus-Hankel form [1, 2, 3]. Often the covariance function is not given explicitly, but must
be estimated from the data itself. To utilize these fast algorithms, the estimated covariance function
must have Toeplitz-plus-Hankel structure. The problem can be posed: Given a data covariance
matrix, computed from a data sequence, find a Toeplitz-plus-Hankel matrix that is closest to the
data matrix in some sense.

Several common random processes (fields) have (block) Toeplitz-plus-Hankel covariance functions.
For example, the first-order Gauss-Markov process

$$x_n = a\,x_{n-1} + w_n, \quad n \ge 1, \quad x_0 = 0, \quad |a| < 1, \qquad (1)$$

where $w_n$ is discrete white noise with variance $\sigma^2$, has the Toeplitz-plus-Hankel covariance function

$$K(i,j) = E[x_i x_j] = \frac{\sigma^2}{1 - a^2}\,\big( a^{|i-j|} - a^{|i+j|} \big), \quad i,j \ge 0. \qquad (2)$$
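
The covariance (2) can be confirmed numerically by writing the process (1) as a linear map of the driving noise. The sketch below is an illustration only; the function names and the parameter values are assumptions, and the noise symbol $w_n$ follows the notation used above.

```python
import numpy as np


def gauss_markov_covariance(a, sigma2, n):
    """Covariance of x_1..x_n for x_m = a x_{m-1} + w_m, x_0 = 0, using x = F w."""
    F = np.zeros((n, n))
    for r in range(n):
        for col in range(r + 1):
            F[r, col] = a ** (r - col)               # x_{r+1} = sum_c a^{r-c} w_{c+1}
    return sigma2 * F @ F.T


def toeplitz_plus_hankel_form(a, sigma2, n):
    """Closed form (2): a Toeplitz term in |i-j| minus a Hankel term in i+j."""
    i, j = np.indices((n, n)) + 1                    # i, j = 1..n
    return sigma2 / (1 - a**2) * (a ** np.abs(i - j) - a ** (i + j))


K_process = gauss_markov_covariance(0.7, 2.0, 6)
K_formula = toeplitz_plus_hankel_form(0.7, 2.0, 6)
```

The Hankel term $a^{i+j}$ decays away from the origin, so the covariance approaches the stationary Toeplitz form $\sigma^2 a^{|i-j|}/(1 - a^2)$ for large $i$ and $j$.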


The two-dimensional circularly symmetric Markovian random field on a polar raster

M 

x,j\  =  “  XI  ^'-1"  +  *  >  F  i  <  N  <  M,  xo,^'  =  0,  |a|  <  1, 

n=l 


(3) 


where $(i, 2\pi N/M)$, $1 \le N \le M$, are polar coordinates on the polar raster and $w_{i,N}$ is two-dimensional
white noise with variance $\sigma^2$, also has the Toeplitz-plus-Hankel covariance (2). Also, in image
processing a two-dimensional isotropic random field is often modelled [4] as having a covariance
function

E[t  fj  t  V  1  —  I\(i  2w—-  1  27r— i  —  o'^+r^-20co*(25r(/V, -A/jI/M) 


~  1  +  +  2)^  -  ~  ;)^]cos(27r(Ai  -  A2)/lV))lnp  (4) 
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if  p  %  1,  which  hcis  a  block  Toephtz-plus-Hankei  structure.  Clearly,  for  these  and  similar  random 
processes  (fields),  a  Toeplitz-plus-Hankel  structured  covariance  estimate  will  be  much  more  accurate 
than  a  Toeplitz  estimate. 

For the special case of a wide-sense stationary random process, the estimated covariance matrix is symmetric Toeplitz. The symmetric Toeplitz matrix minimizing the Hilbert-Schmidt norm of its difference from the data covariance matrix is found by averaging along the diagonals of the data covariance matrix and replacing each element by the average of its diagonal [5]. This is the result of projecting the data covariance matrix on the vector space of all symmetric Toeplitz matrices, where the projection is defined using the Hilbert-Schmidt inner product.
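The diagonal-averaging rule can be sketched in a few lines. This is a minimal illustration, assuming a real square input; the function name and the 4 x 4 test matrix are choices made here, not taken from [5].

```python
import numpy as np

def symmetric_toeplitz_projection(R):
    """Nearest symmetric Toeplitz matrix to R in the Hilbert-Schmidt (Frobenius) norm."""
    R = np.asarray(R, dtype=float)
    n = R.shape[0]
    i, j = np.indices((n, n))
    lag = np.abs(i - j)
    T = np.empty_like(R)
    for k in range(n):
        # Average the k-th superdiagonal and subdiagonal together.
        mask = lag == k
        T[mask] = R[mask].mean()
    return T

# Illustrative 4 x 4 data matrix (an assumption made for this sketch).
R = np.array([[2, 2, 1, 5],
              [1, 1, 3, 6],
              [1, 2, 4, 3],
              [2, 3, 6, 8]], dtype=float)
T = symmetric_toeplitz_projection(R)
toeplitz_error = np.linalg.norm(R - T)  # Hilbert-Schmidt norm of the residual
```

Because the residual R - T is orthogonal to every symmetric Toeplitz matrix, no other symmetric Toeplitz matrix can give a smaller error.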

In this paper we extend this approach to the more general case of Toeplitz-plus-Hankel matrices, following which the algorithms of [1, 2, 3] may be applied. Since the subspace of symmetric Toeplitz matrices is contained in the subspace of symmetric Toeplitz-plus-Hankel matrices, the error (in the Hilbert-Schmidt norm sense) can never exceed the error of the Toeplitz-only approximation. Unfortunately, the method is more complicated than simply averaging along diagonals as in Toeplitz approximation. The basis elements of the subspace need to be computed using a Gram-Schmidt orthogonalization, and there seems to be no simple closed-form expression for an arbitrary element. However, if we restrict ourselves to the subspace of symmetric Toeplitz plus skew-persymmetric Hankel matrices, the optimal approximation can be derived by simply averaging along diagonals and antidiagonals. Both methods are developed in this paper. The extension to approximation for block data covariance matrices is also included. We do not specifically address other constraints such as positive definiteness, although such constraints can be incorporated into one of the methods of Section IV, if needed.

This paper is organized as follows. In Section II, we specify the problem, the criterion used, and the approach employed. In Section III, the optimal Toeplitz-plus-Hankel approximation using basis elements derived from a Gram-Schmidt orthogonalization is derived. In Section IV, the optimal symmetric Toeplitz plus skew-persymmetric Hankel approximation using averaging along the diagonals and antidiagonals is derived. Some examples are also given to demonstrate the procedures. In Section V, the results are extended to block data covariance matrix approximation. Section VI concludes with a summary.

II  PROBLEM  FORMULATION 


A.  Hilbert-Schmidt  Norm 

For any two square real $n \times n$ matrices $A$ and $B$, the Hilbert-Schmidt inner product and norm are defined as

\[ \langle A, B \rangle = \mathrm{Trace}[A B^T]; \qquad \|A\|^2 = \langle A, A \rangle = \sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij}^2. \tag{5} \]

The problem we will deal with can be posed as follows: Given a data covariance matrix $R$, find the Toeplitz-plus-Hankel matrix $\hat{R}$ such that $\|R - \hat{R}\|$ is minimized.

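The solution to this problem can be derived by projecting $R$ onto the subspace of Toeplitz-plus-Hankel matrices. A set of matrices spanning this subspace is

\[ T_i, \quad -(n-1) \le i \le n-1: \quad \text{ones along a single diagonal, zeros elsewhere} \tag{6} \]
\[ H_i, \quad -(n-1) \le i \le n-1: \quad \text{ones along a single antidiagonal, zeros elsewhere} \tag{7} \]

where the $2n-1$ basis functions in (6) span the Toeplitz matrices, and the $2n-1$ matrices in (7) span the Hankel matrices.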

B.  Projection  Approach 

If we are given a set of orthogonal matrices $\{Q_i\}_{i=1}^{k}$, then the minimum distance (norm) between a matrix $R$ and a matrix $\hat{R}$ in the subspace spanned by $\{Q_i\}$ is equal to the distance between $R$ and its projection on this subspace, i.e., if $\|R - \hat{R}\|$ is minimum, then

\[ \hat{R} = \sum_{i=1}^{k} \frac{\langle R, Q_i \rangle}{\langle Q_i, Q_i \rangle}\, Q_i. \tag{8} \]

Consider the special case where the $\{Q_i\}$ are the matrices in (6). Then, since the $\{Q_i\}$ span the subspace of Toeplitz matrices and are orthogonal in the inner product (5), the optimal Toeplitz approximation for any matrix is its projection on this subspace, which leads to averaging along diagonals [5]. If we extend the basis $\{Q_i\}$ to include basis elements for Hankel matrices as well, the error $\|R - \hat{R}\|$ can be no larger than the error for Toeplitz-only approximation. Let $R_T$ be the optimal Toeplitz matrix approximation to $R$, and let $R_{TH}$ be the optimal Toeplitz-plus-Hankel approximation to $R$. Then the improvement in the error metric is

\[ \|R - R_{TH}\|^2 = \|R - R_T\|^2 - \|R_H\|^2 \tag{9} \]

where $R_H$ is the projection of $R$ on the extension of the basis to include Hankel matrices.

We now discuss this basis extension.

III  OPTIMAL TOEPLITZ-PLUS-HANKEL APPROXIMATION

A.  Gram-Schmidt  Orthogonalization 

Unfortunately, while the matrices in (6) are mutually orthogonal, and those in (7) are mutually orthogonal, the union of (6) and (7) is not an orthogonal set in the inner product (5). So while (6) and (7) together span the subspace of Toeplitz-plus-Hankel matrices, they are not an orthogonal basis. Hence the projection of $R$ cannot be computed by simply averaging along the diagonals and antidiagonals.

To use the projection method, the matrices in (7) must be Gram-Schmidt orthogonalized, extending the orthogonal basis in (6). If we represent the matrices in (6) and (7) as $\{Q_i\}_{i=1}^{2n-1}$ and $\{Q_i\}_{i=2n}^{4n-2}$ respectively, then the new orthogonal basis functions $Q'_i$ can be recursively computed by

\[ Q'_i = Q_i, \quad 1 \le i \le 2n-1 \tag{10} \]
\[ Q'_i = Q_i - \sum_{j=1}^{i-1} \frac{\langle Q_i, Q'_j \rangle}{\langle Q'_j, Q'_j \rangle}\, Q'_j, \quad i \ge 2n \tag{11} \]

where any $Q'_i$ that vanishes is discarded.

Given a data covariance matrix $R$, the desired Toeplitz-plus-Hankel approximation $\hat{R}$ can be computed as follows:

1. Adjoin the set of $2n-1$ Toeplitz orthogonal basis elements in (6) to the additional Hankel orthogonal basis elements computed using the Gram-Schmidt procedure of (10)-(11). This yields a complete orthogonal basis, say $\{Q_1, Q_2, \ldots, Q_{4n-4}\}$. (It is shown in Appendix A that there are $4n-4$ orthogonal matrices in this subspace.)

2. Compute $\hat{R}$ using (8), with $k = 4n-4$. The projections on the Toeplitz matrices are found by averaging along diagonals. The projections on the Hankel basis elements are found by taking linear combinations of the elements of $R$, as described in the next step.

3. To compute $\langle R, Q_i \rangle$ for the Hankel basis elements $2n \le i \le 4n-4$, regard $Q_i$ as a stencil. Overlay $R$ with $Q_i$, multiply each element of $R$ by the element of $Q_i$ directly over it, and sum the products. Note that for each $Q_i$ at least half of the elements are zero.
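The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the single-diagonal and single-antidiagonal 0/1 matrices stand in for (6) and (7), dependent elements are detected with a numerical tolerance, and all function names are hypothetical.

```python
import numpy as np

def single_diagonal_bases(n):
    """The 2n-1 Toeplitz and 2n-1 Hankel 0/1 matrices described for (6) and (7)."""
    i, j = np.indices((n, n))
    toeplitz = [(j - i == d).astype(float) for d in range(-(n - 1), n)]
    hankel = [(i + j == s).astype(float) for s in range(2 * n - 1)]
    return toeplitz, hankel

def inner(A, B):
    """Hilbert-Schmidt inner product Trace[A B^T] for real matrices."""
    return float(np.sum(A * B))

def gram_schmidt(mats, tol=1e-10):
    """Orthogonalize in order, discarding elements that depend on earlier ones."""
    basis = []
    for Q in mats:
        Q = Q.copy()
        for P in basis:
            Q -= inner(Q, P) / inner(P, P) * P
        if inner(Q, Q) > tol:
            basis.append(Q)
    return basis

def project(R, basis):
    """Projection of R on the span of an orthogonal basis, as in (8)."""
    return sum(inner(R, Q) / inner(Q, Q) * Q for Q in basis)

n = 4
toeplitz, hankel = single_diagonal_bases(n)
basis = gram_schmidt(toeplitz + hankel)      # step 1
rng = np.random.default_rng(0)               # arbitrary test data
R = rng.standard_normal((n, n))
R_th = project(R, basis)                     # steps 2 and 3
R_t = project(R, toeplitz)                   # Toeplitz-only approximation
```

The stencil description in step 3 corresponds to the elementwise product inside `inner`. The number of surviving basis elements can be compared with the count 4n - 4 derived in Appendix A.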


C.  Example 


Let  R  = 


The  optimal  Toeplitz-plus-Hankel  approximation  R  can  be  computed  as 


R  = 


6-^  1  -m 
The Hilbert-Schmidt norm of the error for this Toeplitz-plus-Hankel approximation is equal to 1. For Toeplitz-only approximation, the error norm is 7.75. The reduction from $\|R - R_T\| = 7.75$ to $\|R - R_{TH}\| = 1$ is due to $\|R_H\| = 7.68$ in (9). Note that the elements of $\hat{R}$ are very close to those of $R$. This is not surprising, since for this example $n = 3$, and there are $4n - 4 = 8$ basis functions, only one short of the nine degrees of freedom required to completely specify an arbitrary $3 \times 3$ matrix.

IV  OPTIMAL  SYMMETRIC  TOEPLITZ  PLUS 
SKEW-PERSYMMETRIC  HANKEL  APPROXIMATION 


The major computational complexity of the above method lies in the Gram-Schmidt orthogonalization procedure. We now show that if we restrict ourselves to a specific class of Toeplitz-plus-Hankel matrices, we obtain a much simpler algorithm which involves simply averaging along diagonals and antidiagonals of the data covariance matrix. This is done in two parts: First, we use a matrix identity to transform this special case of the Toeplitz-plus-Hankel approximation into the more familiar Hermitian Toeplitz approximation problem. Second, we show that this problem is equivalent to averaging along diagonals and antidiagonals.

A.  Transformation  to  Hermitian  Toeplitz  Approximation 

For simplicity, we only consider the case where $n$ is even. Define $I_n$ as the $n \times n$ identity matrix, and $J_n$ as the $n \times n$ exchange matrix with ones on the main antidiagonal. It has been shown in [6] that for any $n \times n$ Hermitian Toeplitz matrix $HT$, the unitary matrix

\[ U = \frac{1}{2}\begin{bmatrix} (1-j) I_{n/2} & (1+j) J_{n/2} \\ (1+j) J_{n/2} & (1-j) I_{n/2} \end{bmatrix}, \qquad U^H = U^{-1} = \frac{1}{2}\begin{bmatrix} (1+j) I_{n/2} & (1-j) J_{n/2} \\ (1-j) J_{n/2} & (1+j) I_{n/2} \end{bmatrix} \tag{12} \]

will transform $HT$ into a sum of real Toeplitz and Hankel matrices:

\[ U (HT) U^H = T \ \text{(Toeplitz matrix)} + H \ \text{(Hankel matrix)} \tag{13} \]

where $T = \mathrm{Re}[HT]$ and $H = \mathrm{Im}[HT] \cdot J_n$. Since $HT$ is a Hermitian Toeplitz matrix, $T$ is a symmetric Toeplitz matrix and $H$ is a skew-persymmetric Hankel matrix with all zero elements on the main antidiagonal.
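The identity (13) is easy to check numerically. The sketch below assumes the unitary matrix can be written compactly as U = [(1 - j) I_n + (1 + j) J_n] / 2, which is how the block form reads for even n; the helper names and the test values are illustrative only.

```python
import numpy as np

def unitary_u(n):
    """Assumed compact form of the unitary matrix: ((1-j) I + (1+j) J) / 2."""
    I = np.eye(n)
    J = np.fliplr(I)
    return 0.5 * ((1 - 1j) * I + (1 + 1j) * J)

def hermitian_toeplitz(first_row):
    """Hermitian Toeplitz matrix from its first row (first entry must be real)."""
    c = np.asarray(first_row, dtype=complex)
    n = len(c)
    i, j = np.indices((n, n))
    lag = j - i
    return np.where(lag >= 0, c[np.abs(lag)], np.conj(c[np.abs(lag)]))

n = 4
J = np.fliplr(np.eye(n))
U = unitary_u(n)
HT = hermitian_toeplitz([4.0, 1 - 2j, 0.5 + 1j, -1 - 3j])  # arbitrary test values
M = U @ HT @ U.conj().T
T = HT.real          # symmetric Toeplitz part
H = HT.imag @ J      # skew-persymmetric Hankel part
print(np.allclose(M, T + H))
```

Flipping H about both axes (J H J) changes its sign, which is the skew-persymmetry stated in the text.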

Since our concern is to obtain the optimal Toeplitz-plus-Hankel approximation, we will reverse the above procedure. More specifically, given any data covariance matrix $R$ we want to find the optimal Toeplitz-plus-Hankel approximation $\hat{R}$, where the Toeplitz and Hankel matrices have the same structure as those of (13). Then

\[ \min_{\hat{R}} \|R - \hat{R}\| = \min_{\hat{R}} \|U^H (R - \hat{R}) U\| = \min_{HT} \|U^H R U - HT\| \tag{14} \]

where $HT = U^H \hat{R} U$ is a Hermitian Toeplitz matrix, and we have used the fact that a unitary transformation is a one-to-one mapping that does not change the Hilbert-Schmidt norm.

We have thus transformed the problem from optimal Toeplitz-plus-Hankel approximation to optimal Hermitian Toeplitz approximation, which can be easily solved [7]. More specifically, given an $n \times n$ matrix $C = [c_{ij}]$, its optimal Hermitian Toeplitz approximation $\hat{C}$ has the entry $\hat{c}_k$ along its $k$th superdiagonal, where

\[ \hat{c}_k = \frac{1}{2(n-k)} \sum_{m=1}^{n-k} \left( c_{m,m+k} + c^*_{m+k,m} \right), \quad k = 0, 1, \ldots, n-1 \tag{15} \]

where $*$ denotes complex conjugate and $\hat{c}_{-k} = \hat{c}_k^*$. After the approximation (Toeplitzation), the resulting Toeplitz-plus-Hankel approximation is

\[ \hat{R} = U (HT) U^H = T + H. \tag{16} \]

The overall procedure to find the optimal symmetric Toeplitz plus skew-persymmetric Hankel approximation $\hat{R}$ (Toeplitz-plus-Hankelization) to the data covariance matrix $R$ can be summarized as follows:

Given the data covariance matrix $R$:

1. Perform the forward transformation $C = U^H R U$;

2. Perform Hermitian Toeplitzation of $C \to \hat{C}$ using (15);

3. Perform the inverse transformation $\hat{R} = U \hat{C} U^H$.
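The three-step procedure can be sketched as below. The compact form used for U, the function names, and the 4 x 4 test matrix are assumptions made for illustration; the Toeplitzation step averages each superdiagonal with the conjugate of the matching subdiagonal.

```python
import numpy as np

def unitary_u(n):
    """Assumed compact form of the unitary matrix: ((1-j) I + (1+j) J) / 2."""
    I = np.eye(n)
    return 0.5 * ((1 - 1j) * I + (1 + 1j) * np.fliplr(I))

def hermitian_toeplitzation(C):
    """Nearest Hermitian Toeplitz matrix to C, by conjugate-symmetric diagonal averaging."""
    n = C.shape[0]
    C_hat = np.zeros((n, n), dtype=complex)
    for k in range(n):
        c_k = (np.diagonal(C, k).sum() + np.conj(np.diagonal(C, -k)).sum()) / (2 * (n - k))
        C_hat += c_k * np.eye(n, k=k)
        if k > 0:
            C_hat += np.conj(c_k) * np.eye(n, k=-k)
    return C_hat

def toeplitz_plus_hankel_by_transform(R):
    R = np.asarray(R, dtype=float)
    U = unitary_u(R.shape[0])
    C = U.conj().T @ R @ U                  # step 1: forward transformation
    C_hat = hermitian_toeplitzation(C)      # step 2: Hermitian Toeplitzation
    return (U @ C_hat @ U.conj().T).real    # step 3: inverse transformation

# Illustrative 4 x 4 test matrix.
R = np.array([[2, 2, 1, 5],
              [1, 1, 3, 6],
              [1, 2, 4, 3],
              [2, 3, 6, 8]], dtype=float)
R_hat = toeplitz_plus_hankel_by_transform(R)
error = np.linalg.norm(R - R_hat)
```

For this test matrix the error norm evaluates to about 4.60. The result of step 3 is real up to rounding, so only the real part is returned.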


B.  Example 


Let

\[ R = \begin{bmatrix} 2 & 2 & 1 & 5 \\ 1 & 1 & 3 & 6 \\ 1 & 2 & 4 & 3 \\ 2 & 3 & 6 & 8 \end{bmatrix}. \]

(i) Forward transformation:

\[ U^H R U = \begin{bmatrix} 5+1.5j & 4-j & 2-2j & 3.5-3j \\ 2+2.5j & 2.5+0.5j & 2.5-1.5j & 3.5-j \\ 3.5+j & 2.5+1.5j & 2.5-0.5j & 2-2.5j \\ 3.5+3j & 2+2j & 4+j & 5-1.5j \end{bmatrix} \]

(ii) Hermitian Toeplitzation:

\[ HT = \begin{bmatrix} 3.75 & 2.83-1.67j & 2.75-1.5j & 3.5-3j \\ 2.83+1.67j & 3.75 & 2.83-1.67j & 2.75-1.5j \\ 2.75+1.5j & 2.83+1.67j & 3.75 & 2.83-1.67j \\ 3.5+3j & 2.75+1.5j & 2.83+1.67j & 3.75 \end{bmatrix} \]

(iii) Inverse transformation:

\[ U(HT)U^H = \underbrace{\begin{bmatrix} 3.75 & 2.83 & 2.75 & 3.5 \\ 2.83 & 3.75 & 2.83 & 2.75 \\ 2.75 & 2.83 & 3.75 & 2.83 \\ 3.5 & 2.75 & 2.83 & 3.75 \end{bmatrix}}_{\text{Toeplitz matrix}} + \underbrace{\begin{bmatrix} -3 & -1.5 & -1.67 & 0 \\ -1.5 & -1.67 & 0 & 1.67 \\ -1.67 & 0 & 1.67 & 1.5 \\ 0 & 1.67 & 1.5 & 3 \end{bmatrix}}_{\text{Hankel matrix}} = \begin{bmatrix} 0.75 & 1.33 & 1.08 & 3.5 \\ 1.33 & 2.08 & 2.83 & 4.42 \\ 1.08 & 2.83 & 5.42 & 4.33 \\ 3.5 & 4.42 & 4.33 & 6.75 \end{bmatrix} \]

The Hilbert-Schmidt norm of the error for this Toeplitz-plus-Hankel approximation is equal to 4.6. For Toeplitz-only approximation, the error norm is 8.05. The main reason for using these transformations is that the Hermitian Toeplitzation problem can be easily solved by simply averaging the elements along diagonals. However, to do this, we need four complex matrix multiplications for the forward and inverse transformations. In the next section we show that this procedure reduces to simple averaging operations along the diagonals and antidiagonals.

C.  Modified  Projection  Method 
Consider the following $(2n-1)$ $n \times n$ matrices:

\[ [T_0]_{i,j} = \delta_{i-j}; \quad [T_l]_{i,j} = \delta_{i-j-l} + \delta_{j-i-l}; \quad [H_l]_{i,j} = \delta_{i+j-(n+1-l)} - \delta_{i+j-(n+1+l)}; \quad 1 \le i,j \le n, \ l = 1, \ldots, n-1 \tag{17} \]

where $\delta_k = 1$ if $k = 0$ and $\delta_k = 0$ otherwise. That is, $T_0$ is the identity, $T_l$ has ones on the $l$th superdiagonal and subdiagonal, and $H_l$ has $+1$ on the $l$th antidiagonal above the main antidiagonal and $-1$ on the $l$th antidiagonal below it.

We now show that the above matrices are mutually orthogonal in the inner product (5), and also span a subspace in which every element can be represented as the sum of a symmetric Toeplitz matrix and a skew-persymmetric Hankel matrix.

Theorem 1 The $2n-1$ matrices in (17) are mutually orthogonal, and hence form a set of basis functions for a subspace.

Proof: see Appendix B.

Theorem 2 A matrix can be represented as the sum of a symmetric Toeplitz matrix and a skew-persymmetric Hankel matrix if and only if it lies in the subspace spanned by the basis functions in (17).

Proof: see Appendix C.

From the above theorems, the optimal approximation $\hat{R}$ of any matrix $R$ by the sum of a symmetric Toeplitz matrix and a skew-persymmetric Hankel matrix can be computed as

\[ \hat{R} = \underbrace{\sum_{i=0}^{n-1} \frac{\langle R, T_i \rangle}{\langle T_i, T_i \rangle}\, T_i}_{\text{averaging along diagonals}} + \underbrace{\sum_{i=1}^{n-1} \frac{\langle R, H_i \rangle}{\langle H_i, H_i \rangle}\, H_i}_{\text{averaging along antidiagonals}} \tag{18} \]

Both the modified projection method and the transformation method compute the projection onto the subspace of symmetric Toeplitz plus skew-persymmetric Hankel matrices, as shown by (14). Since this subspace is convex, the projection is unique. Hence both methods are equivalent. This can also be shown by going through the transformation method algebraically, and showing that the result is the modified projection method.
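The modified projection method amounts to two averaging passes, sketched below under the assumption that the Hankel basis elements carry +1 on an antidiagonal above the main antidiagonal and -1 on its mirror image below. The function name and the test matrix are illustrative.

```python
import numpy as np

def toeplitz_plus_hankel_by_averaging(R):
    """Symmetric Toeplitz part T and skew-persymmetric Hankel part H of the projection of R."""
    R = np.asarray(R, dtype=float)
    n = R.shape[0]
    i, j = np.indices((n, n))          # 0-based indices
    T = np.zeros((n, n))
    H = np.zeros((n, n))
    for l in range(n):
        # Average over the l-th superdiagonal and subdiagonal.
        mask = np.abs(i - j) == l
        T[mask] = R[mask].mean()
    for l in range(1, n):
        plus = (i + j) == (n - 1 - l)   # l-th antidiagonal above the main one
        minus = (i + j) == (n - 1 + l)  # its mirror image below the main one
        h = (R[plus].sum() - R[minus].sum()) / (plus.sum() + minus.sum())
        H[plus] = h
        H[minus] = -h
    return T, H

# Illustrative 4 x 4 test matrix.
R = np.array([[2, 2, 1, 5],
              [1, 1, 3, 6],
              [1, 2, 4, 3],
              [2, 3, 6, 8]], dtype=float)
T, H = toeplitz_plus_hankel_by_averaging(R)
R_hat = T + H
```

No complex arithmetic or matrix multiplication is needed; each element of R is visited twice, once per averaging pass.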

D.  Example 


Consider the same

\[ R = \begin{bmatrix} 2 & 2 & 1 & 5 \\ 1 & 1 & 3 & 6 \\ 1 & 2 & 4 & 3 \\ 2 & 3 & 6 & 8 \end{bmatrix}. \]

Then, from (18),

\[ \hat{R} = \frac{2+1+4+8}{4}\, T_0 + \frac{2+3+3+1+2+6}{6}\, T_1 + \frac{1+6+1+3}{4}\, T_2 + \frac{5+2}{2}\, T_3 + \frac{1+1+1-6-4-3}{6}\, H_1 + \frac{2+1-3-6}{4}\, H_2 + \frac{2-8}{2}\, H_3 \]

\[ = \begin{bmatrix} 0.75 & 1.33 & 1.08 & 3.5 \\ 1.33 & 2.08 & 2.83 & 4.42 \\ 1.08 & 2.83 & 5.42 & 4.33 \\ 3.5 & 4.42 & 4.33 & 6.75 \end{bmatrix}. \]

This example verifies that both the transformation method and the modified projection method produce the same results. However, the latter method only requires averaging along diagonals and antidiagonals, which is much easier than the matrix multiplications.

Incorporation of additional constraints such as rank constraints and positive definiteness has been studied in the Hermitian Toeplitz case [8]. These additional constraints can easily be incorporated into the Toeplitz-plus-Hankel case in the transformation method, since they are preserved by the transformation (13). This is why the transformation method was presented separately.

V  OPTIMAL  SYMMETRIC  TOEPLITZ  BLOCK-TOEPLITZ 
PLUS  SKEW-PERSYMMETRIC  HANKEL 
BLOCK-HANKEL  APPROXIMATION 

Block  data  covariance  matrices  occur  in  many  multichannel  and  multidimensional  problems  [9].  To 
utilize  the  fast  algorithm  developed  in  [3]  for  computing  the  optimal  prediction  filters,  we  need  to 
find  an  optimal  block  Toeplitz-plus-Hankel  approximation  to  a  block  data  covariance  matrix. 

A.  Multichannel  Generalization  of  Previous  Results 

We focus on the symmetric Toeplitz block-Toeplitz plus skew-persymmetric Hankel block-Hankel case with $n$ $p \times p$ blocks, where $n$ is even. If a matrix $R$ has such structure, then it can be represented by

\[ R = \underbrace{\begin{bmatrix} R_0 & \cdots & R_{-(n-1)} \\ \vdots & R_0 & \vdots \\ R_{n-1} & \cdots & R_0 \end{bmatrix}}_{\text{symmetric Toeplitz block-Toeplitz matrix}} + \underbrace{\begin{bmatrix} \tilde{R}_{-(n-1)} J & \cdots & \tilde{R}_0 J \\ \vdots & \tilde{R}_0 J & \vdots \\ \tilde{R}_0 J & \cdots & \tilde{R}_{n-1} J \end{bmatrix}}_{\text{skew-persymmetric Hankel block-Hankel matrix}} \tag{19} \]

where $R_i = R_{-i}^T$, $\tilde{R}_i = -\tilde{R}_{-i}^T$, and both $R_i$ and $\tilde{R}_i$ are $p \times p$ Toeplitz matrices for $-(n-1) \le i \le n-1$. For this type of matrix, we can extend the unitary transform (13) from [6] to the following multichannel case:


\[ U = \frac{1}{2}\begin{bmatrix} (1-j) I_{np/2} & (1+j) J_{np/2} \\ (1+j) J_{np/2} & (1-j) I_{np/2} \end{bmatrix} \tag{20} \]

and we also have

\[ U^H = U^{-1} = \frac{1}{2}\begin{bmatrix} (1+j) I_{np/2} & (1-j) J_{np/2} \\ (1-j) J_{np/2} & (1+j) I_{np/2} \end{bmatrix}. \tag{21} \]

Then

\[ U^{-1} R U = \begin{bmatrix} \bar{R}_0 & \cdots & \bar{R}_{-(n-1)} \\ \vdots & \bar{R}_0 & \vdots \\ \bar{R}_{n-1} & \cdots & \bar{R}_0 \end{bmatrix} \tag{22} \]

where $\bar{R}_i = R_i + j\tilde{R}_i$. We have $\bar{R}_i = \bar{R}_{-i}^H$ by the assumption of $R_i = R_{-i}^T$ and $\tilde{R}_i = -\tilde{R}_{-i}^T$, so that the block matrix resulting from the transformation is a block Hermitian Toeplitz matrix.

The procedures for computing the optimal symmetric Toeplitz block-Toeplitz plus skew-persymmetric Hankel block-Hankel matrix are the same as before, except now all the matrices become block matrices.

To avoid the matrix multiplications, the projection method is applicable, with some modifications of the basis functions. We can represent the $p \times p$ matrices in (6) and (7) respectively as

\[ [\tilde{T}_i]_{jk} = \delta_{k-j-i}; \quad [\tilde{H}_i]_{jk} = \delta_{k+j-(p+1-i)}, \quad \text{for all } 1 \le j,k \le p, \text{ and } -(p-1) \le i \le p-1 \tag{23} \]

where $\delta_{i-j} = 1$ if $i = j$ and $\delta_{i-j} = 0$ if $i \ne j$. The $n$ symmetric Toeplitz matrices and the $n-1$ skew-persymmetric Hankel matrices in (17) can then be respectively represented as

\[ [T_0]_{i,j} = \delta_{i-j}; \quad [T_l]_{i,j} = \delta_{i-j-l} + \delta_{j-i-l}; \quad 1 \le i,j \le n, \ l = 1, \ldots, n-1 \tag{24} \]

\[ [H_l]_{i,j} = \delta_{i+j-(n+1-l)} - \delta_{i+j-(n+1+l)}; \quad 1 \le i,j \le n, \ l = 1, \ldots, n-1. \tag{25} \]

Then the basis functions for the subspace of symmetric Toeplitz block-Toeplitz matrices are

\[ T_0 \odot \tilde{T}_0; \quad T_0 \odot (\tilde{T}_l + \tilde{T}_{-l}); \quad T_i \odot \tilde{T}_j \quad \text{for } l = 1, \ldots, p-1; \ i = 1, \ldots, n-1; \ j = -(p-1), \ldots, p-1 \tag{26} \]

and the basis functions for the skew-persymmetric Hankel block-Hankel matrices are

\[ (J \cdot T_0) \odot (\tilde{H}_l - \tilde{H}_{-l}); \quad H_i \odot \tilde{H}_j \quad \text{for } l = 1, \ldots, p-1; \ i = 1, \ldots, n-1; \ j = -(p-1), \ldots, p-1 \tag{27} \]

where $\odot$ is the modified outer product operation, defined as $A \odot B = \{a_{m,n} B\}$ if $n \ge m$, and $\{a_{m,n} (JBJ)\}$ if $n < m$, where $a_{m,n}$ is the $(m,n)$th element of matrix $A$. Since all the elements in $T_i$ ($\tilde{T}_i$) and $H_i$ ($\tilde{H}_i$) are either 0 or $\pm 1$, the outer product is a simple operation in this special case. These two sets of matrices are easily verified to be orthogonal by following the same steps used in Theorem 1. The resulting approximation for a symmetric Toeplitz block-Toeplitz plus skew-persymmetric Hankel block-Hankel matrix is equivalent to projecting this block matrix on the subspace spanned by the basis functions in (26) and (27), which leads to averaging the diagonal and antidiagonal blocks, and then the diagonals and antidiagonals of each block.

B.  Example 


Assume  R  = 


5 

8 

1 

1 

2 

L  4 


4 

2 

.3 

3 

14 

3 


1 

12 

4 

8 

2 

15 


Then, after forward transformation, block Hermitian Toeplitzation, and inverse transformation (or using the projection method directly), we obtain the optimal approximation

\[ \hat{R} = \begin{bmatrix} 6 & 4.5 & 5.375 & 4.25 & 6 & 2.5 \\ 4.5 & 6 & 4.75 & 5.375 & 4.5 & 6 \\ 5.375 & 4.75 & 6 & 4.5 & 5.375 & 4.25 \\ 4.25 & 5.375 & 4.5 & 6 & 4.75 & 5.375 \\ 6 & 4.5 & 5.375 & 4.75 & 6 & 4.5 \\ 2.5 & 6 & 4.25 & 5.375 & 4.5 & 6 \end{bmatrix} + \begin{bmatrix} -5 & 0.5 & -2 & 0.75 & -2.5 & 0 \\ 0.5 & -0.5 & 0.75 & -1.75 & 0 & 2.5 \\ -2 & 0.75 & -2 & 0 & 1.75 & -0.75 \\ 0.75 & -1.75 & 0 & 2 & -0.75 & 2 \\ -2.5 & 0 & 1.75 & -0.75 & 0.5 & -0.5 \\ 0 & 2.5 & -0.75 & 2 & -0.5 & 5 \end{bmatrix} \]

The fast algorithm of [3] was designed for linear prediction on a polar raster. Since covariance functions on a polar raster are periodic in the angular variables, the associated covariance matrices will have circulant blocks. The basis (26)-(27) should be modified to


T,  0  To;  T,  0  (Tj  +  for  i  =  0 . n  -  1;  j  =  1 . p  -  1  (28) 

for  the  Toeplitz  block-circulant  matrices  and 

HiQ{J -Tq)-,  H,Q{Hj  +  for  t  =  1,. . .  ,Ti  -  1;  j  =  1 - ,p-l  (29) 

for  the  Hankel  block-circulant  matrices.  These  basis  functions  are  easily  shown  to  be  orthogonal, 
so  the  projection  on  this  basis  can  again  be  found  by  averaging  along  diagonals  and  antidiagonals. 

VI  CONCLUSION 

In this paper, the well-known problem of "Toeplitzation" of a data covariance matrix has been extended to Toeplitz-plus-Hankel approximation of matrices. The general solution can be computed by projecting the given data covariance matrix on the space of Toeplitz-plus-Hankel matrices. Although the basis functions for this subspace can be recursively generated, as the size of the matrix grows large, the Gram-Schmidt orthogonalization requires much computation. To obtain a simpler algorithm, we can restrict ourselves to the subspace of symmetric Toeplitz plus skew-persymmetric Hankel matrices, for which the optimal approximation can be efficiently computed by averaging along diagonals and antidiagonals. We also show that the same result can be achieved by a unitary transformation along with Hermitian Toeplitzation; the latter algorithm permits additional constraints such as rank constraints and positive definiteness.

For the multichannel and multidimensional problems, approximation for a block data covariance matrix is also considered. The optimal symmetric Toeplitz block-Toeplitz plus skew-persymmetric Hankel block-Hankel matrix can be derived either by using the unitary transformation along with block Hermitian Toeplitzation, or by the more efficient projection method.

APPENDIX A

Prove that there are only $4n - 4$ matrices left in (6) and (7) after the Gram-Schmidt orthogonalization.

Since the matrices in (6) and (7) span the space of Toeplitz-plus-Hankel matrices, the number of orthogonal matrices in this subspace is equal to the number of linearly independent matrices in (6) and (7).

We use the same notation as in (23) for $T_i$ and $H_i$, with the replacement of $p$ by $n$. Since $\{T_i\}_{i=-(n-1)}^{n-1}$ is a set of linearly independent (also orthogonal) matrices, we adjoin to it the elements of $\{H_i\}_{i=-(n-1)}^{n-1}$ one at a time. If the newly added element is linearly dependent on the previous matrices, then we remove it. So the matrices remaining form a set of linearly independent matrices.

If  a  matrix  is  linearly  dependent  with  a  set  of  matrices,  then  we  must  be  able  to  find  a  sequence 
of  lines  such  that  each  non-zero  element  in  these  dependent  matrices  is  crossed  by  these  lines  at 
least  twice.  Note  that  the  non-zero  elements  in  matrices  of  (6)  and  (7)  are  some  specific  lines  in 
either  NE-SW  or  NW-SE  directions.  It  is  easy  to  check  that  if  the  index  j  is  even  (odd),  then  for 
the  above  condition  to  hold  the  elements  (.ff±(n-2))  ^re  always  required. 

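The only other possibilities are $H_{-(n-2)}$ and $H_{n-1}$. If $n$ is even, we have

\[ \sum_{i\ \mathrm{even}} T_i = \sum_{i\ \mathrm{odd}} H_i; \qquad \sum_{i\ \mathrm{odd}} T_i = \sum_{i\ \mathrm{even}} H_i. \tag{30} \]

Reordering the terms, we get

\[ H_{-(n-2)} = \sum_{i\ \mathrm{odd}} T_i - \sum_{i\ \mathrm{even},\ i \ne -(n-2)} H_i; \qquad H_{n-1} = \sum_{i\ \mathrm{even}} T_i - \sum_{i\ \mathrm{odd},\ i \ne n-1} H_i \tag{31} \]

which means that $H_{-(n-2)}$ and $H_{n-1}$ are linearly dependent on the other matrices. If $n$ is odd, simply interchange "even" and "odd" in the above argument. Therefore, there are $2(2n-1) - 2 = 4n - 4$ linearly independent matrices in (6) and (7). ■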

APPENDIX B

Proof of Theorem 1

We use the same notation as in (24) and (25) for $T_i$ and $H_i$. It is easy to verify that $\langle T_m, T_l \rangle = 0$ and $\langle H_m, H_l \rangle = 0$ in the sense of (5) if $m \ne l$ (see [5]). Therefore, we only need to show that $T_m$ and $H_l$ are mutually orthogonal. From (24) and (25), we have ($m \ne 0$)

\[ \langle T_m, H_l \rangle = \mathrm{Trace}[T_m^T H_l] = \sum_{i=1}^{n}\sum_{k=1}^{n} [T_m]_{i,k} [H_l]_{i,k} \]
\[ = \sum_{i=1}^{n}\sum_{k=1}^{n} \left[\delta_{i-k-m} + \delta_{k-i-m}\right]\left[\delta_{k+i-(n+1-l)} - \delta_{k+i-(n+1+l)}\right] \]
\[ = \sum_{i=1}^{n}\sum_{k=1}^{n} \Big( \underbrace{\delta_{i-k-m}\,\delta_{k+i-(n+1-l)}}_{\text{Point } A} + \underbrace{\delta_{k-i-m}\,\delta_{k+i-(n+1-l)}}_{\text{Point } B} - \underbrace{\delta_{i-k-m}\,\delta_{k+i-(n+1+l)}}_{\text{Point } C} - \underbrace{\delta_{k-i-m}\,\delta_{k+i-(n+1+l)}}_{\text{Point } D} \Big) \tag{32} \]

The value of (32) is then determined by the intersections of these four straight lines, i.e., $i - k = m$, $k - i = m$, $k + i = n + 1 + l$, and $k + i = n + 1 - l$, as shown in Figure 1. If $m \ne 0$, by symmetry these four lines either do not intersect at all on the index grid ($A = B = C = D = 0$), or have four intersections, for which $A = B = C = D = 1$. In both cases, (32) is equal to zero. If $m = 0$, the line $i = k$ meets the lines $k + i = n + 1 + l$ and $k + i = n + 1 - l$ in two points whose contributions have opposite signs (or in no grid point at all), and the result is still equal to zero. Therefore, $T_m$ and $H_l$ are mutually orthogonal, and these $2n - 1$ matrices form a set of basis functions. ■
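Theorem 1 can also be checked numerically for a given n. The sketch below builds the matrices from the Kronecker-delta definitions used in this proof and forms their Gram matrix; the function names and the choice n = 6 are illustrative assumptions.

```python
import numpy as np

def symmetric_toeplitz_basis(n):
    """T_0 (identity) and T_l with ones on the l-th superdiagonal and subdiagonal."""
    i, j = np.indices((n, n))
    return [(np.abs(i - j) == l).astype(float) for l in range(n)]

def skew_persymmetric_hankel_basis(n):
    """H_l with +1 where i + j = n + 1 - l and -1 where i + j = n + 1 + l (1-based)."""
    i, j = np.indices((n, n))
    s = i + j + 2  # convert the 0-based index sum to the 1-based sum used in the text
    return [(s == n + 1 - l).astype(float) - (s == n + 1 + l).astype(float)
            for l in range(1, n)]

def gram_matrix(mats):
    """Matrix of Hilbert-Schmidt inner products between all pairs."""
    return np.array([[np.sum(A * B) for B in mats] for A in mats])

n = 6
basis = symmetric_toeplitz_basis(n) + skew_persymmetric_hankel_basis(n)
G = gram_matrix(basis)
off_diagonal = G - np.diag(np.diag(G))
print(np.allclose(off_diagonal, 0))
```

A diagonal Gram matrix with nonzero diagonal entries confirms that the 2n - 1 matrices are mutually orthogonal and linearly independent for that n.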

APPENDIX  C 

Proof  of  Theorem  2 

(a) "If" part: Any vector in this subspace can be represented as $\sum_{i=0}^{n-1} \alpha_i T_i + \sum_{i=1}^{n-1} \beta_i H_i$. It is obvious that the first sum is a symmetric Toeplitz matrix, and the second sum is a skew-persymmetric Hankel matrix. (b) "Only if" part: If a matrix $C$ can be represented as the sum of a symmetric Toeplitz matrix $T$ and a skew-persymmetric Hankel matrix $H$, then $C$ can be represented as $C = \sum_{i=0}^{n-1} [T]_{1,i+1}\, T_i + \sum_{i=1}^{n-1} [H]_{1,(n-i)}\, H_i$. Hence this matrix lies in the space spanned by the basis functions of (17). ■

Abstract 

A zero-mean homogeneous random field is defined on a discrete polar raster. Given sample values inside a disk of finite radius, we wish to estimate the field's power spectral density using linear prediction. Issues arising here include estimation of covariance lags, and extendibility of a finite set of lag estimates into a positive semi-definite covariance extension (required for a meaningful spectral density). We give a generalized autocorrelation procedure that guarantees positive semi-definite covariance estimates. It first interpolates the data using Gaussians, computes its Radon transform, and applies familiar one-dimensional techniques to each slice. Some numerical examples are provided to justify the validity of the proposed procedure. We also propose a correlation matching covariance extension procedure that uses the Radon transform to extend a given set of covariance lags to the entire plane, when this is possible, and discuss circumstances for which this is impossible.


I  INTRODUCTION 


In many applications, such as tomographic imaging problems solved by filtered back-projection [1], and spotlight synthetic aperture radar [2], data are collected on a polar raster of points, rather than on a rectangular lattice. To process such data, e.g., to remove undesired frequency components, we need to estimate the power spectral density for data defined on a polar raster.

The obvious approach of simply estimating the 1-D power spectral density independently along each slice will give an incorrect answer, since the 2-D Fourier transform on a polar raster is not given by the 1-D Fourier transform along each slice. One approach, the 2-D periodogram, is to interpolate the data onto a rectangular lattice, and then take the 2-D Fourier transform of the resampled values. We note here that for a rectangular raster, 1-D spectral estimation techniques have been applied, first by columns, then by rows, in some "separable" 2-D spectral estimators [3, 4]. While these separable estimators do compute the 2-D Fourier transform correctly, they neglect correlation between rows and columns.

A major problem with the 2-D periodogram is the poor resolution of spectral estimates based on a small amount of data [5]. This is due to truncation of the covariance lags, since only a finite number of data samples is available. To overcome this difficulty in 1-D, parametric modeling is used to extend the finite set of covariance lags. Linear prediction (AR modeling) is the most common approach due to its simplicity and high-resolution spectral estimates. New contributions of this paper include the following:

1. An "autocorrelation" 2-D spectrum estimation procedure which uses the Radon transform to transform the 2-D problem into an uncorrelated set of 1-D spectrum estimation problems. It is an autocorrelation method in that all unknown values are windowed to zero, as in the autocorrelation method for 1-D linear prediction, for computing the Radon transform. It differs from a previous Radon-based 2-D spectrum estimation procedure [6] in the following three ways:

(a) The Radon transform is computed in a different manner that ensures a non-negative estimate of power spectral density;

(b) The use of 1-D linear prediction to obtain finer-resolution 1-D spectrum estimates along each 2-D slice;

(c) Discussion of the effects of the 1-D covariance extension along each slice on the 2-D covariance (viz. correlation matching holds in the Radon transform domain, but not in the 2-D domain);

2. A new 2-D covariance extension procedure that extends a set of 2-D covariance lags defined in a finite disk to the entire plane, when this is possible. Unlike the first procedure, this procedure has the correlation matching property of preserving the given covariance lags in the 2-D domain;

3. A discussion of various interpolating functions used to compute the discrete Radon transform, and implications of their use for 2-D spectrum estimation.

A. Review of 2-D Linear Prediction on a Rectangular Raster

Many aspects of 1-D linear prediction have been shown to generalize to the 2-D case defined on
a  rectangular  raster  [7].  For  example,  stability  and  minimum  phase  properties  are  still  related  to 
reflection  coefficients  [7].  However,  two  vital  aspects  do  not  generalize  to  the  2-D  case: 

1. Causality, which has a clear definition in the 1-D case, has been defined in at least two different
ways on a 2-D rectangular raster. Asymmetric half-plane causality [7] splits the 2-D raster into
"past" and "future" half-planes; the 2-D AR model has support in the "past". Quarter-plane
causality [8] means that the 2-D autoregressive (AR) model has support in a quarter-plane,
e.g. to the "southwest" of the present point. Since quarter-plane causality is a special case of
asymmetric half-plane causality, we consider only the latter in the sequel.

2. An essential feature of 1-D linear prediction is covariance extendibility. Given a finite positive
semi-definite (psd) set of covariance lag estimates, it is always possible to extend this set into an
infinite psd set of covariance lags. This is important since a non-psd set of covariance lags will
lead to negative values in the estimated power spectral density. However, this property does not
extend to the 2-D case on a rectangular raster.
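
As a small worked illustration of 1-D covariance extendibility (an assumed first-order example, not taken from the source), consider two given lags $r(0) = 1$ and $r(1) = \rho$ with $|\rho| < 1$. The AR(1) model fitted to them extends the lags to all $j$, and the resulting spectrum is non-negative:

```latex
% Assumed example: two given lags r(0) = 1, r(1) = \rho, with |\rho| < 1.
\begin{align*}
  a(1) &= -\frac{r(1)}{r(0)} = -\rho
       && \text{(first-order Yule-Walker solution)} \\
  r(j) &= -a(1)\, r(j-1) = \rho^{|j|}, \quad |j| \ge 2
       && \text{(recursive extension of the lags)} \\
  S(\omega) &= \sum_{j=-\infty}^{\infty} \rho^{|j|} e^{-\mathrm{j}\omega j}
             = \frac{1-\rho^{2}}{1 - 2\rho\cos\omega + \rho^{2}} > 0
       && \text{(extended set is psd)}
\end{align*}
```

The given lags $r(0)$ and $r(1)$ are reproduced exactly by the extended sequence, which is the correlation matching property in its simplest form.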

For asymmetric half-plane causality, the region of support for the 2-D AR model is infinite, so that
truncation is clearly necessary. This truncation is the cause of much of the difficulty in 2-D linear
prediction; it results in a discontinuous region of support, and even in the 1-D case a discontinuous
region of support creates problems. In ([7], p. 59) a 1-D example with discontinuous support results
in a non-minimum phase AR filter that does not satisfy the correlation-matching property. Indeed, a
finite set of 2-D psd covariances with discontinuous support may not even have a psd extension over
the entire plane [9].

The cause of the difficulties can be seen by examining the Yule-Walker equations for determining
the AR filter coefficients from the covariance lag estimates. These equations have block-Toeplitz form,
so that the number of covariance lag estimates exceeds the number of AR filter coefficients (see [5], p.
495 for a specific example). This has two implications:

1. An infinite number of different covariance lag estimates can be associated with the same AR
model. Hence the correlation matching property, which guarantees that the spectral estimate
will be consistent with the finite set of lag estimates, no longer holds;

2. Covariance extension from a finite set of estimated lags requires recursion using the 2-D AR
model, over an asymmetric half-plane. Since the region of support is infinite, and only a finite
set of lag estimates is given, truncation is necessary, and this may result in a non-psd covariance
extension.

B.  2-D  Linear  Prediction  on  a  Polar  Raster 

In this paper we address, for the first time, similar questions for a random field defined on a polar
raster. On a polar raster, causality is defined unambiguously in terms of increasing radius; the region
of support for an AR model at any point on a given circle is the disk inside the circle. Since this disk
is a continuous region of support, the result of [9] is inapplicable.

Indeed, we give an explicit procedure which inputs discrete sample values inside a finite disk, and
outputs a set of psd covariance lags. We call this covariance extension, although strictly speaking we
are not extending a set of covariance lags, but creating an extended set of psd lags from a finite set
of data. In Section VI we propose another algorithm that explicitly extends a finite set of 2-D lags to
the entire plane, provided this is possible.

In this paper, we propose using the Radon transform to decouple the 2-D spectral estimation
problem into a set of 1-D problems. The projection-slice theorem tells us that there are two ways to
compute the 2-D Fourier transform: (1) compute it directly by taking the 2-D Fourier
transform; or (2) take the Radon transform first, and then apply the 1-D Fourier transform
along each direction in the spectral domain. This suggests the following algorithm for 2-D spectral
estimation: (1) take the Radon transform of the data; (2) extrapolate the 1-D covariance lags in the
Radon transform domain along each direction, using 1-D linear prediction; and then (3) superpose
the 1-D spectral estimates to form a 2-D spectral estimate, defined on a polar raster.

Note that the available data are discrete samples, but the projection-slice theorem only holds for
continuous data, so we need to find some interpolating functions to interpolate the discrete data.
Since the sampling theorem on a polar raster is very different from that on the rectangular lattice,
the interpolating functions for a band-limited signal are quite complicated [10, 11]. In this paper, we
propose using gaussian interpolating functions to compute the Radon transform of the given discrete
data. The merits of gaussian vs. other interpolating functions are also discussed.
It should be noted that our proposed "interpolating" function does not agree with the original specified
discrete data points; indeed it should more properly be termed a "defocusing" function. To make it
easier for the reader, we give "interpolating function" a definition slightly different from the usual; see
Section III.


This  paper  is  organized  as  follows.  Section  II  proposes  a  psd  covariance  extension  method  using 
the  Radon  transform.  Section  III  discusses  the  choice  of  interpolating  functions  to  transform  the 
discrete  data  samples  into  continuous  data.  The  analytically  explicit  procedure  using  the  gaussian 
interpolating  functions  is  then  given  in  Section  IV.  This  procedure  can  be  used  to  provide  a  high- 
resolution  spectral  estimate  for  points  defined  on  a  polar  raster.  Some  numerical  examples  are  given 
in  Section  V.  In  Section  VI,  we  propose  a  2-D  psd  covariance  extension  technique  that  also  has  the 
correlation  matching  property,  provided  that  a  2-D  psd  extension  exists.  Section  VII  concludes  with 
a  summary. 

II  2-D  LINEAR  PREDICTION  AND  PSD  COVARIANCE 
EXTENSION  ON  A  POLAR  RASTER 

A.  Problem  Formulation 

The problem considered is as follows. A set of data is defined on a polar raster. We are given
discrete sample values $\{f(i,m),\ 0 \le i \le N,\ 1 \le m \le M\}$ at the points $(i\,\delta r,\ 2\pi m/M)$ on the polar
raster, as shown in Figure 1; $i$ is the integer radius from the origin, $\delta r$ is the radial spacing, and $m$ is
the integer index of angular position, corresponding to an angle of $2\pi m/M$ radians. The goal is to
compute a psd set of covariance lags everywhere in the plane.

The assumption of discrete samples is required, since any numerical procedure will ultimately
require discretization. We point out here that if the data is generated from an isotropic random field
which is bandlimited in wavenumber to a disk of radius $\pi$, and $M > 2\pi N$, then the discrete sampled
points $\{f(i,m),\ 0 \le i \le N,\ 1 \le m \le M\}$ may be interpolated to give the exact value of the random
field everywhere in the disk of radius $N$ [10].

B.  The  2-D  Radon  Transform  and  Projection-Slice  Theorem 

In order to decouple the 2-D linear prediction problem into a set of 1-D linear prediction problems
along each slice, it is necessary to first compute the Radon transform of the data. The 2-D Radon
transform is defined as

$$\hat{f}(t,\theta) = \mathcal{R}\{f(x,y)\} = \int\!\!\int f(x,y)\,\delta(t - x\cos\theta - y\sin\theta)\,dx\,dy \tag{1}$$

so that the Radon transform is the set of projections or line integrals of $f(x,y)$ along all possible lines.

An important property of the Radon transform is the projection-slice theorem, which states that
the 2-D Fourier transform $F(k,\theta)$ in polar coordinates of $f(x,y)$ can be computed by taking 1-D
Fourier transforms along each slice of the Radon transform of $f(x,y)$, so that [12]

$$F(k,\theta) = \mathcal{F}_2\{f(x,y)\} = \mathcal{F}_{t\to k}\{\hat{f}(t,\theta)\} \tag{2}$$

where $\mathcal{F}_2$ denotes the 2-D Fourier transform, $\mathcal{F}_{t\to k}$ denotes a 1-D Fourier transform taking $t$ into
wavenumber $k$, and $\hat{f}(t,\theta)$ is computed
using (1). A discrete version of the projection-slice theorem has been used to develop a fast algorithm
for computing 2-D discrete Fourier transforms: first the discrete Radon transform is computed, and
then 1-D discrete Fourier transforms are computed along each slice of the Radon transform [13]. Since
both transforms are parallelizable, this can save computation time.
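
A short worked check of the projection-slice theorem may help here. The example below assumes an isotropic Gaussian test function with width parameter $\sigma$ (an illustrative choice, not part of the statement of the theorem) and the Fourier convention $\int f\,e^{-\mathrm{j}kt}\,dt$:

```latex
% Assumed test function: f(x,y) = exp(-(x^2+y^2)/(2 sigma^2)).
\begin{align*}
  \hat{f}(t,\theta)
    &= \int_{-\infty}^{\infty} e^{-(t^{2}+s^{2})/2\sigma^{2}}\, ds
     = \sigma\sqrt{2\pi}\; e^{-t^{2}/2\sigma^{2}}
     && \text{(line integral; } s \text{ runs along the line)} \\
  \mathcal{F}_{t\to k}\{\hat{f}(t,\theta)\}
    &= \sigma\sqrt{2\pi}\cdot\sigma\sqrt{2\pi}\; e^{-\sigma^{2}k^{2}/2}
     = 2\pi\sigma^{2}\, e^{-\sigma^{2}k^{2}/2}
     && \text{(1-D transform of the slice)} \\
  \mathcal{F}_2\{f\}(k\cos\theta,\, k\sin\theta)
    &= 2\pi\sigma^{2}\, e^{-\sigma^{2}k^{2}/2}
     && \text{(2-D transform on the same ray)}
\end{align*}
```

The last two lines agree for every $\theta$, as the theorem requires.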

For a homogeneous random field, it may be shown that the Radon transform is a whitening transform:
each slice of the Radon transform of a homogeneous random field is uncorrelated with each
other slice [14]. This suggests that the 2-D linear prediction problem can be decoupled into a set
of independent 1-D linear prediction problems by Radon transforming the data. This approach was
taken in [6]; however, [6] did not consider the problems of linear prediction on a polar raster, from a
finite disk of data, correlation matching, and psd covariance extension.

C.  Procedure  for  2-D  Covariance  Extension 

Clearly computation of the Radon transform from the data will require interpolation. In the
next section, we will discuss how to choose an interpolating function to transform data from the
discrete domain into the continuous domain. At present, for convenience we assume that the data are
continuous and inside a disk of finite radius.


Since we have data only inside a disk of finite radius, we propose an "autocorrelation" method
in which the unknown data are windowed to zero for purposes of computing the Radon transform.
The Radon transform is then computed analytically. Finally, the 1-D autocorrelation form of linear
prediction is used on each slice of the Radon transform to get a set of psd covariance estimates.

The  term  “autocorrelation  method”,  in  the  linear  prediction  sense  of  the  term,  is  justified  due  to 
the  following  two  properties  of  the  Radon  transform: 

1. Let $\hat{x}(t,\phi)$ be the Radon transform of the random field $x(r,\theta)$ (using polar coordinates throughout).
Note from (1) that for any $T > 0$, $\{\hat{x}(t,\phi),\ t > T\}$ depends only on $\{x(r,\theta),\ r > T\}$. Hence
windowing the data to zero for $i > N$ is equivalent to windowing its Radon transform to zero
for $t > N$.

2. Using (2), it is clear that

$$\mathcal{R}\{f(x,y) ** g(x,y)\} = \hat{f}(t,\theta) * \hat{g}(t,\theta)$$

where $**$ denotes 2-D convolution and $*$ denotes 1-D convolution in $t$. Setting $f(x,y) = x(r,\theta)$
and $g(x,y) = x(r,-\theta)$ in (2) shows that the following two methods are equivalent:

(a)  Windowing  the  Radon  transform  of  the  data  to  zero,  and  then  forming  the  covariance  lag 
estimates  from  these  Radon  transforms; 

(b)  Forming  the  covariance  lag  estimates  directly  from  the  windowed  data  (the  autocorrelation 
method  of  linear  prediction),  and  then  Radon  transforming  the  lag  estimates. 
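
The convolution property used in item 2 follows in one line from the projection-slice theorem (2). The chain of equalities can be sketched as follows, with $F$ and $G$ denoting the 2-D Fourier transforms of $f$ and $g$ (symbols introduced here only for this illustration):

```latex
\begin{align*}
  \mathcal{F}_{t\to k}\big\{\mathcal{R}\{f ** g\}\big\}
    &= \mathcal{F}_2\{f ** g\}(k,\theta)
       && \text{(projection-slice theorem)} \\
    &= F(k,\theta)\, G(k,\theta)
       && \text{(2-D convolution theorem)} \\
    &= \mathcal{F}_{t\to k}\{\hat{f}(t,\theta)\}\;
       \mathcal{F}_{t\to k}\{\hat{g}(t,\theta)\}
       && \text{(projection-slice theorem, twice)} \\
    &= \mathcal{F}_{t\to k}\{\hat{f}(t,\theta) * \hat{g}(t,\theta)\}
       && \text{(1-D convolution theorem)}
\end{align*}
```

Inverting the 1-D transform on both sides gives $\mathcal{R}\{f ** g\} = \hat{f} * \hat{g}$, with the 1-D convolution taken in $t$ at fixed $\theta$.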

III  COMPUTATION OF DISCRETE RADON TRANSFORM

In this section we discuss the computation of the Radon transform of a function defined on a
discrete lattice of points. We call such a transform a discrete Radon transform. The discrete Radon
transform will be used in the spectral estimation technique proposed below. To facilitate comparison
of our method with various other definitions of the discrete Radon transform, we consider first a
rectangular lattice of discrete points, and then a polar raster of discrete points (the latter is the actual
case of interest).

A. Rectangular Lattice

Consider a function $f_{i,j}$ defined on a rectangular lattice of points $(i,j)$, where $i$ and $j$ are integers
such that $-M \le i,j \le M$ for some $M$. Our goal is to define and compute the discrete Radon transform
of $f_{i,j}$ such that the following properties hold:

1. Computation of the discrete Radon transform requires as little time and storage as possible;

2. The Radon transform possesses the projection-slice property;

3. The Radon transform $\hat{f}(t,\theta)$ of a psd discrete 2-D function is psd in $t$ for each $\theta$.

Note  that  ease  of  invertibility  of  the  discrete  Radon  transform  is  not  an  issue  here,  since  the 
projection-slice  theorem  states  that  the  2-D  spectrum  on  a  polar  raster  is  immediately  determined 
from  the  1-D  spectra  of  the  Radon  transforms.  Hence  ease  of  computation  of  the  forward  transform, 
not  the  inverse  transform,  is  significant. 

Our approach is to interpolate $f_{i,j}$ into a continuous function $f(x,y)$, defined as

$$f(x,y) = \sum_{i=-M}^{M}\sum_{j=-M}^{M} f_{i,j}\,\phi(x - i,\, y - j) \tag{3}$$

where $\phi(x,y)$ is defined here as an interpolating function. The discrete Radon transform $\hat{f}(t,\theta)$ of $f_{i,j}$
is then defined to be the same as the Radon transform of its interpolation $f(x,y)$, which is

$$\hat{f}(t,\theta) = \mathcal{R}\{f(x,y)\} = \sum_{i=-M}^{M}\sum_{j=-M}^{M} f_{i,j}\,\mathcal{R}\{\phi(x - i,\, y - j)\}
= \sum_{i=-M}^{M}\sum_{j=-M}^{M} f_{i,j}\,\hat{\phi}(t - i\cos\theta - j\sin\theta,\ \theta) \tag{4}$$

where $\mathcal{R}\{\phi(x,y)\} = \hat{\phi}(t,\theta)$. This definition clearly possesses the projection-slice property. In the following
we consider some common interpolating functions; for other interpolating functions, see [15].
Some choices of interpolating function $\phi(x,y)$, and the resulting discrete Radon transforms, are:

1. Impulses

Choosing for the interpolating function the 2-D impulse function $\phi(x,y) = \delta(x)\delta(y)$ results in

$$\hat{f}(t,\theta) = \sum_{i=-M}^{M}\sum_{j=-M}^{M} f_{i,j}\,\delta(t - i\cos\theta - j\sin\theta) \tag{5}$$

since $\mathcal{R}\{\delta(x)\delta(y)\} = \delta(t)$.

For this choice of interpolating function, the discrete Radon transform is zero unless the line passes
precisely through a lattice point; hence $\hat{f}(t,\theta)$ is zero except for a finite set of $t$ and $\theta$ (excluding values
found from only a single lattice point). This makes this choice unsuitable for 2-D spectral estimation.

This is the discrete Radon transform defined by Beylkin in [16]. Note that on an infinite 2-D lattice
of integers ($M \to \infty$), the set of lines through the origin for which the discrete Radon transform is
non-zero is precisely the set of lines with rational slopes.

2.  Square  Pixels 

A common method of computing the Radon transform of a sampled function is to assume that
$f_{i,j}$ represents the value of the square $1 \times 1$ pixel centered at coordinates $(x,y) = (i,j)$. The Radon
transform is then computed as follows. For each line, multiply the length of the line within a pixel by
the value $f_{i,j}$ of that pixel, and sum over all pixels through which that line passes. This method was
used in [6] to compute the Radon transform for 2-D spectral estimation.

This pixel assumption is clearly equivalent to using for the interpolating function $\phi(x,y) =
\mathrm{rect}(x)\,\mathrm{rect}(y)$, where $\mathrm{rect}(x) = 1$ if $-1/2 \le x \le 1/2$, and $= 0$ otherwise. The resulting discrete
Radon transform is then

$$\hat{f}(t,\theta) = \sum_{i=-M}^{M}\sum_{j=-M}^{M} f_{i,j}\,\hat{\phi}(t - i\cos\theta - j\sin\theta,\ \theta)$$

where ([12], p. 62)

$$\hat{\phi}(t,\theta) = \mathcal{R}\{\mathrm{rect}(x)\,\mathrm{rect}(y)\} =
\begin{cases}
\dfrac{\cos\theta + \sin\theta + 2t}{2\sin\theta\cos\theta}, & \text{if } -\sin\theta - \cos\theta \le 2t < \sin\theta - \cos\theta \\[2ex]
\dfrac{1}{\cos\theta}, & \text{if } \sin\theta - \cos\theta \le 2t < \cos\theta - \sin\theta \\[2ex]
\dfrac{\cos\theta + \sin\theta - 2t}{2\sin\theta\cos\theta}, & \text{if } \cos\theta - \sin\theta \le 2t < \cos\theta + \sin\theta \\[2ex]
0, & \text{otherwise.}
\end{cases} \tag{6}$$

It is clear that this requires a considerable amount of computation, in violation of condition #1
above. It should be noted that the value of this definition of discrete Radon transform is that its inverse
Radon transform may be computed by solving a linear (but large) system of equations. However, this
is not valuable to us in the context of 2-D spectral estimation, since the 2-D spectrum can be found
from the 1-D spectrum immediately using the projection-slice property. Hence there is no reason to
make the choice of interpolating functions implicitly made in [6]. A more serious problem is that there
is no guarantee that the resulting $\hat{f}(t,\theta)$ will be psd, in violation of condition #3.
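
The square-pixel projection can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure of [6]: it writes the trapezoid of (6) in a form symmetric in $|t|$, $|\cos\theta|$ and $|\sin\theta|$ so that it applies at any angle, and the function names and the dictionary layout for the lattice values are assumptions made for this example.

```python
import math


def rect_projection(t, theta):
    """Length of the chord cut from the unit square centred at the origin by
    the line x*cos(theta) + y*sin(theta) = t (the trapezoid of eq. (6))."""
    a = abs(math.cos(theta))
    b = abs(math.sin(theta))
    lo = abs(a - b) / 2.0          # half-width of the flat top
    hi = (a + b) / 2.0             # half-width of the whole trapezoid
    s = abs(t)
    if s >= hi:
        return 0.0                 # line misses the pixel
    if s <= lo:
        return 1.0 / max(a, b)     # line crosses two opposite sides
    return (hi - s) / (a * b)      # line cuts a corner: linear ramp


def discrete_radon_square(f, t, theta):
    """Square-pixel discrete Radon transform: f maps lattice points (i, j)
    to pixel values (a hypothetical layout chosen for this sketch)."""
    c, s = math.cos(theta), math.sin(theta)
    return sum(v * rect_projection(t - i * c - j * s, theta)
               for (i, j), v in f.items())
```

Every evaluation touches every pixel and branches on the geometry, which is the computational burden noted above. A quick sanity check is that the projection of one pixel integrates to the pixel area, 1, at any angle.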

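3. Sinc Functions

Regarding $f_{i,j}$ as samples of a continuous function bandlimited in radial wavenumber to $[-\pi,\pi]$
(note that this may or may not actually be the case), the choice
$\phi(x,y) = \dfrac{J_1(\pi\sqrt{x^2+y^2})}{2\sqrt{x^2+y^2}}$, where $J_1$ is the first-order Bessel function of the first kind, results in

$$\hat{f}(t,\theta) = \sum_{i=-M}^{M}\sum_{j=-M}^{M} f_{i,j}\,\mathrm{sinc}(t - i\cos\theta - j\sin\theta) \tag{7}$$

since $\mathcal{R}\left\{\dfrac{J_1(\pi\sqrt{x^2+y^2})}{2\sqrt{x^2+y^2}}\right\} = \mathrm{sinc}(t)$, with $\mathrm{sinc}(t) = \sin(\pi t)/(\pi t)$.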

This discrete Radon transform is easily computed, satisfying condition #1. The projection-slice
property (condition #2) is automatically satisfied. Condition #3, that $\hat{f}(t,\theta)$ be psd in $t$, may not
seem at first glance to be satisfied, but if $f_{i,j}$ is psd, and regarded as samples of a bandlimited function
sampled above the Nyquist rate, then its interpolation $f(x,y)$ is also psd. This means that the 2-D
Fourier transform of $f(x,y)$ is non-negative, and by the projection-slice property the 1-D Fourier
transform of $\hat{f}(t,\theta)$ is non-negative for each $\theta$, so that $\hat{f}(t,\theta)$ is psd, as required.

$\hat{f}(t,\theta)$ can also be seen to be psd as follows. First, consider the $j = 0$ terms. These can be
interpreted as the interpolation of sampled values $f_{i,0}$ using a discretization length $\Delta = \cos\theta \le 1$
(corresponding to a sampling rate above the Nyquist rate). Repeating this argument for each value
of $j$, $\hat{f}(t,\theta)$ can be interpreted as a sum of delayed signals, each of which is bandlimited and psd.
Hence their sum must be psd. Furthermore, the projection-slice property also implies that $\hat{f}(t,\theta)$
is bandlimited to $[-\pi,\pi]$, so that it may be sampled in $t$ and standard discrete-time 1-D spectral
estimation techniques applied to it.

Note that regarding $f_{i,j}$ as samples of a continuous function bandlimited in wavenumber to $-\pi \le
k_x, k_y \le \pi$ leads to the choice $\phi(x,y) = \mathrm{sinc}(x)\,\mathrm{sinc}(y)$. The lack of radial symmetry in $\phi(x,y)$ makes
its Radon transform a $\theta$-dependent sinc function. The above argument for (7) is also applicable
to this case.

B.  Polar  Raster 

We now consider the same problem, but on a discrete polar raster of points having radius $N$ and
$M$ radial slices. This is the problem of interest, since our data is given on such a discrete lattice.

The major difference between the rectangular and polar rasters is that, on a polar raster, translation
must be described in terms of polar coordinates. Hence $f_{i,j}$ becomes $f_{i,n}$, where integer $i$ denotes radius
and integer $n$ denotes an angle of $2\pi n/M$ radians from the horizontal (i.e., the radial slice). Equation
(3) for interpolating the $f_{i,j}$ must be modified to

$$f(r,\zeta) = \sum_{i=0}^{N}\sum_{n=1}^{M} f_{i,n}\,\delta[(r,\zeta) - (i, 2\pi n/M)] ** \phi(r,\zeta) \tag{8}$$

where $**$ denotes a 2-D convolution in polar coordinates and

$$\delta[(r,\zeta) - (i, 2\pi n/M)] = \delta(r\cos\zeta - i\cos(2\pi n/M))\,\delta(r\sin\zeta - i\sin(2\pi n/M)) \tag{9}$$

is a 2-D impulse.

The discrete Radon transform of $f_{i,n}$ is again defined to be the same as the Radon transform
of its interpolation $f(r,\zeta)$. Using the property that the Radon transform of a convolution is the
convolution (in $t$) of the Radon transforms (obvious from the projection-slice property), equation (8)
is modified to

$$\hat{f}(t,\theta) = \mathcal{R}\{f(r,\zeta)\} = \sum_{i=0}^{N}\sum_{n=1}^{M} f_{i,n}\,\mathcal{R}\{\delta[(r,\zeta) - (i, 2\pi n/M)]\} * \mathcal{R}\{\phi(r,\zeta)\}$$

$$= \sum_{i=0}^{N}\sum_{n=1}^{M} f_{i,n}\,\delta(t - i\cos(\theta - 2\pi n/M)) * \hat{\phi}(t,\theta)
= \sum_{i=0}^{N}\sum_{n=1}^{M} f_{i,n}\,\hat{\phi}(t - i\cos(\theta - 2\pi n/M),\ \theta)$$

where $\mathcal{R}\{\phi(r,\zeta)\} = \hat{\phi}(t,\theta)$ and we have used the fact that $\mathcal{R}\{f(\mathbf{x} - \mathbf{a})\} = \hat{f}(t - \mathbf{e}\cdot\mathbf{a},\ \mathbf{e})$ [12], where $\mathbf{x}$
and $\mathbf{a}$ are vectors and $\mathbf{e}$ is a unit vector.

1. Impulses

Choosing for the interpolating function the 2-D impulse function $\phi(r,\zeta) = \delta(r)$ results in

$$\hat{f}(t,\theta) = \sum_{i=0}^{N}\sum_{n=1}^{M} f_{i,n}\,\delta(t - i\cos(\theta - 2\pi n/M)) \tag{10}$$

For this choice of interpolating function, the discrete Radon transform is again zero unless the line
passes precisely through a lattice point. This happens when $t = x\cos\theta + y\sin\theta = i\cos(\theta - 2\pi n/M)$,
i.e., $x = i\cos(2\pi n/M)$ and $y = i\sin(2\pi n/M)$. Again, only a finite number of lines pass through more than one lattice
point; hence this choice is unsuitable for 2-D spectral estimation.

2. Sinc Functions

The choice $\phi(r,\zeta) = \dfrac{J_1(\pi r)}{2r}$ results in

$$\hat{f}(t,\theta) = \sum_{i=0}^{N}\sum_{n=1}^{M} f_{i,n}\,\mathrm{sinc}(t - i\cos(\theta - 2\pi n/M)) \tag{11}$$

This is a plausible choice. However, this choice of interpolating function does NOT correspond to
interpolating samples of a bandlimited function, since the sampling is performed on a polar raster. The
problem of interpolating a bandlimited function from samples on a polar raster has been considered in
[11]; however [11] required that the samples be taken at non-uniform radial distances, corresponding
to the interlaced zeros of Bessel functions of the first kind of various orders. Hence the results of [11]
are not applicable here.

3. Gaussian Functions

The choice $\phi(r,\zeta) = e^{-r^2/2\sigma^2}$ results in

$$\hat{f}(t,\theta) = \sigma\sqrt{2\pi}\sum_{i=0}^{N}\sum_{n=1}^{M} f_{i,n}\,e^{-(t - i\cos(\theta - 2\pi n/M))^2/2\sigma^2} \tag{12}$$

since $\mathcal{R}\{e^{-r^2/2\sigma^2}\} = \sigma\sqrt{2\pi}\,e^{-t^2/2\sigma^2}$. This is easy to compute, satisfying condition #1, and the projection-slice
property (condition #2) holds automatically. Moreover, unlike the sinc interpolating functions, a set
of psd $f_{i,n}$ guarantees that $\hat{f}(t,\theta)$ will be psd in $t$, so that condition #3 is also satisfied. This is true
since: (1) the Fourier transform of a Gaussian function is also Gaussian; and (2) a Gaussian function
is always positive. We now prove that condition #3 is satisfied.

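Recall that the interpolated function $f(\mathbf{x})$, where $\mathbf{x}$ is a vector, is defined by

$$f(\mathbf{x}) = \sum_{i=0}^{N}\sum_{n=1}^{M} f_{i,n}\,\delta(\mathbf{x} - (i, 2\pi n/M)) ** \phi(\mathbf{x}) \tag{13}$$

where $(i, 2\pi n/M)$ is a point on the polar raster and $**$ denotes a 2-D convolution. Taking the 2-D
Fourier transform of this yields

$$F(\mathbf{k}) = \left[\sum_{i=0}^{N}\sum_{n=1}^{M} f_{i,n}\,e^{-\mathrm{j}\,\mathbf{k}\cdot(i, 2\pi n/M)}\right]\Phi(\mathbf{k}) \tag{14}$$

where $\mathbf{k}$ is a wavenumber vector and $\Phi(\mathbf{k}) = \mathcal{F}_2\{\phi(\mathbf{x})\}$ is the 2-D Fourier transform of $\phi(\mathbf{x})$. We recognize the expression multiplying $\Phi(\mathbf{k})$
as the 2-D discrete-time Fourier transform (2DDTFT) of $f_{i,n}$ in discrete polar coordinates; since $f_{i,n}$ is
psd this is non-negative. If $\Phi(\mathbf{k})$ is non-negative, $F(\mathbf{k})$ is also non-negative, and by the projection-slice
property $\hat{f}(t,\theta)$ is psd in $t$. Hence conditions #1-#3 are all satisfied if: (1) $\Phi(\mathbf{k}) \ge 0$; (2) both $\phi(\mathbf{x})$
and $\hat{\phi}(t,\theta)$ have simple forms in polar coordinates; and (3) both $\phi(\mathbf{x})$ and $\hat{\phi}(t,\theta)$ have "reasonable"
forms that interpolate the data (this excludes impulses).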

C.  Choice  of  Interpolating  Function 

At one extreme we have the impulse interpolating function, and at the other extreme we have the
sinc interpolating function. The gaussian interpolating function occupies a middle ground. Although
there is no firm basis for choice, we have chosen the gaussian interpolating function because it occupies
the middle ground.


Another reason to choose the gaussian interpolating function is that by varying the variance, we
can control the width of the interpolating function in both space and wavenumber. Note from (14)
that the interpolation operation plays the role of filtering, and that the resulting spectrum is
proportional to the spectrum of the interpolating function.

More specifically, the Fourier transform of a gaussian function $g(x,y) = e^{-(x^2+y^2)/2\sigma^2}$ is equal to
$G(w_1,w_2) = \mathcal{F}_2\{g(x,y)\} = 2\pi\sigma^2 e^{-\sigma^2(w_1^2+w_2^2)/2}$, which means that the spectrum of the gaussian interpolating
function is still a gaussian function, with bandwidth inversely proportional to $\sigma^2$ (the variance). So
if we choose a large $\sigma$, the interpolating function has a slowly decaying tail and behaves like a low-pass
filter. Hence, we can get a smooth spectrum with low fluctuations. However, the high frequency
components would be highly degraded. On the other hand, if we choose a small $\sigma$, the interpolating
function approaches an impulse and passes high frequency components as well. However, in this case the evaluation
of the Radon transform in some directions does not account for enough data points to fully reflect
the nature of the desired spectrum; therefore, large fluctuations are likely to occur. Note that due to
the bell shape of the spectrum of the gaussian interpolating function, low frequency components are
expected to be less degraded and provide better resolution. If we have a priori information about the
nature of the spectrum, we can choose a suitable $\sigma$ accordingly.
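
To make the trade-off concrete, the following sketch evaluates the spectrum of the gaussian interpolating function and the wavenumber at which it falls to half of its peak. It assumes the form $g(x,y) = e^{-(x^2+y^2)/2\sigma^2}$ used in this section; the function names and the half-power criterion are illustrative choices, not part of the original procedure.

```python
import math


def gaussian_spectrum(w1, w2, sigma):
    """2-D Fourier transform of exp(-(x^2 + y^2) / (2 sigma^2))."""
    return 2.0 * math.pi * sigma ** 2 * math.exp(
        -sigma ** 2 * (w1 ** 2 + w2 ** 2) / 2.0)


def half_power_wavenumber(sigma):
    """Radial wavenumber at which the spectrum drops to half its peak:
    exp(-sigma^2 w^2 / 2) = 1/2  gives  w = sqrt(2 ln 2) / sigma."""
    return math.sqrt(2.0 * math.log(2.0)) / sigma


if __name__ == "__main__":
    for sigma in (0.5, 1.0, 2.0):
        print(sigma, half_power_wavenumber(sigma))
```

Doubling $\sigma$ halves the half-power wavenumber, so a wide interpolating function smooths the estimate at the cost of attenuating high wavenumbers, while a narrow one preserves them but averages over fewer data points per projection.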

In view of the effect of $\sigma$ on the resulting spectrum, we propose the following
gaussian interpolating functions with different $\sigma$ and normalization constants ($c$ is a constant):

$$g(x,y) = e^{-(x^2+y^2)/2\sigma^2}, \quad \text{with constant } \sigma \tag{15}$$

$$g(x,y) = e^{-(x^2+y^2)/2\sigma^2}, \quad \sigma = c \cdot i \ \ (i \text{ denotes radius of the available data}) \tag{16}$$

$$g(x,y) = e^{-(x^2+y^2)/2\sigma^2}, \quad \sigma = c \cdot i' \ \ (i' \text{ denotes radius of the data evaluated}) \tag{17}$$

$$g(x,y) = \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-(x^2+y^2)/2\sigma^2}, \quad \sigma = c \cdot i \ \ (i \text{ denotes radius of the available data}) \tag{18}$$

The function (15) is the most basic one. The functions (16) and (17) take into account the fact that
for data points farther away from the origin, the superposition effect using interpolation will not be
the same if we use a constant $\sigma$. If $\sigma$ increases in proportion to the radius, the interpolating function
decays more slowly as the radius increases, so that the weighting can be kept the same independent
of the radius. The interpolating function of (18) is a normalized one in the sense that the weighting
of the available data point is normalized to 1 as in the discretization case.

IV  HIGH-RESOLUTION  SPECTRAL  ESTIMATION 

We now focus on the gaussian interpolating functions, and use them to derive an analytically
explicit procedure for spectral estimation with data points defined on a polar raster. Following the
notation used in Section III, we obtain

$$f(x,y) = \sum_{i=0}^{N}\sum_{m=1}^{M} f(i,m)\,e^{-[(x - r_i\cos\theta_m)^2 + (y - r_i\sin\theta_m)^2]/2\sigma^2} \tag{19}$$

where $r_i = i\,\delta r$ and $\theta_m = 2\pi m/M$. Using the shifting property of the Radon transform and
$\mathcal{R}\{e^{-r^2/2\sigma^2}\} = \sigma\sqrt{2\pi}\,e^{-t^2/2\sigma^2}$,
it is straightforward to show that the exact Radon transform of (19) is

$$\hat{f}(t,\phi) = \mathcal{R}\{f(x,y)\} = \sigma\sqrt{2\pi}\sum_{i=0}^{N}\sum_{m=1}^{M} f(i,m)\,e^{-(t - r_i\cos(\phi - \theta_m))^2/2\sigma^2} \tag{20}$$
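
A direct evaluation of the closed-form Radon transform of the gaussian-interpolated polar data can be sketched as below. It assumes a constant $\sigma$ (the basic case (15)); the function name, the nested-list layout `f[i][m-1]` for the samples $f(i,m)$, and the default radial spacing are hypothetical choices made for this example.

```python
import math


def radon_gaussian_polar(f, t, phi, sigma, dr=1.0):
    """Radon transform at offset t and angle phi of polar-raster samples
    interpolated with exp(-r^2 / (2 sigma^2)).

    f[i][m-1] holds the sample at radius i*dr and angle 2*pi*m/M,
    for i = 0..N and m = 1..M (layout assumed for this sketch)."""
    n_angles = len(f[0])
    total = 0.0
    for i, row in enumerate(f):
        r_i = i * dr
        for m, value in enumerate(row, start=1):
            theta_m = 2.0 * math.pi * m / n_angles
            # Shift property: a sample at (r_i, theta_m) projects to
            # t = r_i * cos(phi - theta_m) on the slice at angle phi.
            shift = r_i * math.cos(phi - theta_m)
            total += value * math.exp(-(t - shift) ** 2 / (2.0 * sigma ** 2))
    return sigma * math.sqrt(2.0 * math.pi) * total
```

The cost is one exponential per data sample per $(t,\phi)$ pair, with no geometric case analysis, which is what makes the gaussian choice inexpensive compared with the square-pixel projection.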

The complete procedure for estimating the power spectral density of a zero-mean homogeneous
random field given discrete data $\{f(i,m),\ 0 \le i \le N,\ 1 \le m \le M\}$ and using the autocorrelation
method of linear prediction is as follows:

1. Use (20) to compute the Radon transform $\hat{f}(t,\phi_j)$ of the data $f(i,m)$ at values of $t$ with
equal spacing and at $L$ angles $\phi_j$, $j = 1, \ldots, L$;

2. For each $\phi$, compute the autocorrelation of the Radon transform; i.e., autocorrelate the results
of step 1 along each slice, using (2) (with $\mathbf{x} = (x,y)$):

$$\hat{r}_\phi(t) \triangleq \mathcal{R}_\phi\{r(\mathbf{x})\} = \mathcal{R}_\phi\{f(\mathbf{x}) ** f(-\mathbf{x})\} = \mathcal{R}_\phi\{f(\mathbf{x})\} * \mathcal{R}_\phi\{f(-\mathbf{x})\} = \hat{f}(t,\phi) * \hat{f}(t,-\phi) \tag{21}$$

3. For each $\phi$, fit a 1-D AR($p$) model ($p$ may vary for different $\phi$) to the projection data using
the autocorrelation estimates, giving a set of linear prediction coefficients $\{h(k)\}$,
$k = 0, \ldots, p-1$;




4. For each $\phi$, use the 1-D AR($p$) model to extend the covariance by [5]

$$\hat{r}_\phi(j) = -\sum_{k=1}^{p-1} h(k)\,\hat{r}_\phi(j-k), \qquad j > p \tag{22}$$

5. Take 1-D Fourier transforms along each slice. This is the estimated 2-D spectral density.
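
The per-slice part of the procedure (autocorrelation, AR fit, covariance extension, spectrum) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses the textbook Levinson-Durbin convention with coefficients `a[0..p]`, `a[0] = 1`, and extension $r(j) = -\sum_{k=1}^{p} a_k\, r(j-k)$, which may index the coefficients differently from $h(k)$ above; all function names are hypothetical.

```python
import cmath


def autocorr(x, p):
    """Biased autocorrelation lags r[0..p] of one slice, with samples
    outside the slice taken as zero (the autocorrelation method)."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) / n
            for k in range(p + 1)]


def levinson(r):
    """Levinson-Durbin recursion on lags r[0..p]. Returns the AR
    coefficients a (a[0] = 1), the reflection coefficients, and the
    final prediction-error power."""
    a, err, ks = [1.0], r[0], []
    for m in range(1, len(r)):
        acc = sum(a[k] * r[m - k] for k in range(m))
        k_m = -acc / err
        ks.append(k_m)
        ext = a + [0.0]
        a = [ext[k] + k_m * ext[m - k] for k in range(m + 1)]
        err *= 1.0 - k_m * k_m
    return a, ks, err


def extend_lags(r, a, n_total):
    """Extend the lags with r(j) = -sum_{k=1..p} a[k] r(j-k) for j > p."""
    p = len(a) - 1
    out = list(r[:p + 1])
    for j in range(p + 1, n_total):
        out.append(-sum(a[k] * out[j - k] for k in range(1, p + 1)))
    return out


def ar_spectrum(a, err, omega):
    """AR spectral estimate err / |A(e^{j omega})|^2 along one slice."""
    big_a = sum(a[k] * cmath.exp(-1j * omega * k) for k in range(len(a)))
    return err / abs(big_a) ** 2
```

Applying `autocorr`, `levinson` and `ar_spectrum` to each sampled slice of the Radon transform, one angle at a time, gives the spectral estimate on a polar raster. Reflection coefficients of magnitude less than one indicate that the lag set for that slice is positive definite, so the extension produced by `extend_lags` stays psd and the spectrum is non-negative.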

Some  comments  are  in  order  here: 

1. The "autocorrelation" assumption of windowing data to zero for $i > N$ is required in order to
compute the Radon transform of the data, since even $\hat{x}(0,\phi)$ depends on $\{x(r,\theta),\ r > N\}$;

2. It is therefore consistent to make the same assumption in fitting the 1-D AR models to each slice
of the Radon transform;

3. As noted above, the Radon transform and autocorrelation operations commute, so the above
procedure can properly be termed an "autocorrelation" procedure;


4. The covariance function of the interpolated function (19) is

$$r(u,v) = \int_x\!\int_y f(x,y)\,f(x+u,\,y+v)\,dx\,dy
= \pi\sigma^2\sum_i\sum_j\sum_k\sum_l f(i,j)\,f(k,l)\,e^{-[(u - r_k\cos\theta_l + r_i\cos\theta_j)^2 + (v - r_k\sin\theta_l + r_i\sin\theta_j)^2]/4\sigma^2} \tag{23}$$

which is a Gaussian-weighted sum of the available discrete data samples; the weighting depends
on the distance vector between two points. Equation (23) also provides a method to compute
the covariance for data defined on a polar raster.


5. A significant advantage of this procedure is that it guarantees a psd covariance extension of the finite set of lag estimates computed from the data. This is required to ensure a non-negative power spectral density estimate;


6. Since the correlation matching property holds for the 1-D linear prediction technique along each slice of the Radon transform, it also must hold for the entire 2-D spectral estimate, in that the 2-D covariance function derived from the 2-D spectral estimate will match the estimated covariance lags in the Radon transform domain.

V  SIMULATION  AND  DISCUSSION 

In this section we provide some examples to demonstrate the proposed spectral estimation procedure. The data are assumed to be available on a polar raster ($I \times M$, where $I$ is the number of points along each direction (with radial spacing $\delta_r$), and $M$ is the number of angular partitions). A Gaussian function is used as the interpolating function $\phi(x,y)$.

To compute the 2-D periodogram, we resample the interpolated data on a rectangular lattice (with spacing $\delta_x, \delta_y$ along the abscissa and ordinate, respectively), zero-pad the points along each axis from $L$ points to 128 points, and then take a 128 × 128 2-D discrete Fourier transform. To use the proposed new spectral estimation algorithm, we compute the Radon transform of the interpolated data, and then sample it on an $I' \times M'$ polar raster, where $I'$ is the number of points along each direction (with spacing $\delta_{r'}$), and $M'$ is the number of angular partitions. The proposed spectral estimation procedure is then performed independently along each slice.
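
The zero-padded periodogram used as the baseline can be sketched as follows. This is a minimal sketch under stated assumptions: the function name and the normalization by the number of samples are choices made for this illustration and are not specified in the text.

```python
import numpy as np

def periodogram_2d(data, n_fft=128):
    """Zero-pad an L x L block to n_fft x n_fft and return |DFT|^2 divided by
    the number of data samples, with zero frequency moved to the center."""
    data = np.asarray(data, dtype=float)
    spectrum = np.fft.fft2(data, s=(n_fft, n_fft))  # s zero-pads each axis
    return np.fft.fftshift(np.abs(spectrum) ** 2) / data.size

# Example: a constant 25 x 25 block has all its power at zero frequency
P = periodogram_2d(np.ones((25, 25)))
```

After `fftshift` the zero-frequency bin sits at index `n_fft // 2` along each axis, which is where the peak of this constant example appears.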

Note that in the following figures, the abscissa and ordinate denote the $x$ and $y$ axes for the 2-D periodograms, and radius and angle for the proposed method, respectively. For clarity, only one quadrant or one half of the spectrum is shown in the following figures. This is appropriate since the proposed method generates the spectral estimate slice-by-slice. However, the figures for the proposed method must be visually interpreted differently.

EXAMPLE  1 

The algorithm of [17] was used to generate a single realization of a zero-mean isotropic random field with power spectral density

$$S_1(\omega_1,\omega_2) = 4e^{-0.03(\omega_1^2+\omega_2^2)}$$

which is a circularly symmetric spectrum as shown in Figure 2. The available data were 3 × 6 ($I = 3$, $M = 6$) with radial spacing $\delta_r = 0.2$. We used $\delta_x = \delta_y = \delta_{r'} = 0.1$, $L = 25$, $I' = 6$, $M' = 36$, and chose the interpolating function defined in (16) with $\sigma = 0.15$.

The resulting spectral estimates are shown in Figures 3 (for the 2-D periodogram) and 4 (for the proposed method with AR(4) modeling along each slice). The estimated spectra in both figures are similar, and close to the true spectrum.

EXAMPLE  2a 

The algorithm of [17] was used to generate a single realization of an isotropic random field with power spectral density

$$S_2(\omega_1,\omega_2) = \begin{cases} 10 & \text{if } \omega_1^2 + \omega_2^2 < (0.645\pi)^2 \\ 0 & \text{otherwise} \end{cases}$$

which is plotted in Figure 5. The available data had $I = 3$ and $M = 6$ with radial spacing $\delta_r = 0.2$. We used $\delta_x = \delta_y = \delta_{r'} = 0.1$, $L = 31$, $I' = 12$, $M' = 72$, and the interpolating function defined in (16).

The resulting spectral estimates are shown in Figures 6 (2-D periodogram) and 7 (spectrum using the proposed method). Note that the proposed procedure provides better transition performance at the discontinuity of the original spectrum. However, the 1-D extrapolation of the 1-D covariance also causes a slight increase of the high frequency components in Figure 7.

EXAMPLE  2b 

The algorithm of [17] was used to generate a single realization of an isotropic random field with power spectral density $S_2(\omega_1,\omega_2)$ (same as for Example 2a) plus a white Gaussian noise field at an SNR equal to 7 dB. Now $\sigma = 0.3$ is used in the interpolating function (16); all other parameters are the same as in Example 2a.

The resulting estimates of power spectral density are shown in Figures 8 (2-D periodogram) and 9 (spectrum using the proposed method), respectively. Note that the estimated spectrum in Figure 9 is not as good as that in Figure 7, due to the additive white noise. However, it is still better than the 2-D periodogram shown in Figure 8.

EXAMPLE 3a


Here the random field whose power spectral density is to be estimated is the deterministic 2-D signal

$$D_1(x,y) = \cos(\omega_1 x + \omega_2 y) + \cos(\omega_3 x + \omega_4 y)$$

where $(\omega_1,\omega_2) = (0.173\pi, 0.1\pi)$, $(\omega_3,\omega_4) = (0.12\pi, 0.208\pi)$, $x = i\delta_r\cos(j\theta)$, $y = i\delta_r\sin(j\theta)$, $0 \le i \le I$, $1 \le j \le M$, and $\theta = 2\pi/12$ ($M = 12$). This consists of two closely spaced low frequency sinusoids. The available data has $I = 12$ and $M = 12$ with radial spacing $\delta_r = 1$. We used $\delta_x = \delta_y = \delta_{r'} = 1$, $L = 31$, $I' = 12$, $M' = 72$, and the normalized interpolating function defined in (18). $\sigma$ is chosen to be 0.02i, which is much smaller than the spacings of the interpolated points, so the interpolating function is close to an impulse function. This is a reasonable choice, since if $\sigma$ is too large, the spectrum will be smeared by those of the adjacent directions, which will reduce the overall resolution. An AR(3) model is used to extrapolate the 1-D covariances in the proposed method.

The resulting spectral estimates are shown in Figures 10 (2-D periodogram) and 11 (spectrum using the proposed method). Note that the 2-D periodogram in Figure 10 shows only one peak, so that it fails to resolve the two sinusoids. In contrast, for the proposed method in Figure 11, two peaks are apparent and are located at $(0.175\pi, 0.100\pi)$ and $(0.117\pi, 0.203\pi)$, respectively, which are very close to the true frequencies. More accurate results were achieved using more points along each direction. Also note that the artifacts in Figure 10 are exaggerated in appearance, due to the nature of the plotting axes. A radial, rather than rectangular, plot of axes radius $r$ vs. angle $\theta$ would reduce the visibility of the artifacts.

EXAMPLE  3b 

Here the random field whose power spectral density is to be estimated consists of the deterministic signal from Example 3a plus a single realization of a white Gaussian noise field with unit power:

$$D_2(x,y) = \cos(\omega_1 x + \omega_2 y) + \cos(\omega_3 x + \omega_4 y) + w(x,y)$$

We use the normalized interpolating function defined in (18) with $\sigma$ = 0.02i, and AR(3) modeling along each slice in the Radon transform domain. All other parameters are the same as in Example 3a.

The resulting spectral estimates are shown in Figures 12 (2-D periodogram) and 13 (proposed method). Although some spurious peaks appear in Figure 13, due to the additive white noise, the two peaks for the sinusoidal input signals are still clearly distinguishable in Figure 13. The 2-D periodogram in Figure 12 not only contains many spurious peaks, but also fails to resolve the two sinusoids.

Use  of  a  Bessel  function  as  the  interpolating  function  gave  less  satisfactory  results;  in  the  resulting 
spectral  estimate  the  two  sinusoidal  peaks  are  not  resolved. 

VI  2-D  CORRELATION  MATCHING  ON  A  POLAR  RASTER 

A.  Introduction 

The  above  2-D  spectral  estimation  method  may  be  used  to  estimate  2-D  spectra  on  a  polar  raster, 
either  directly  from  data  or  from  specified  covariance  lags.  In  the  latter  case,  however,  the  above 
method  does  not  preserve  the  specified  covariance  lags:  The  inverse  2-D  Fourier  transform  of  the  2-D 
power  spectral  density  (the  2-D  covariance)  does  not  match  the  given  covariance  lags.  Hence  it  does 
not  satisfy  correlation  matching  in  the  2-D  plane. 

In this section we propose a procedure that extends a given set of 2-D covariance lags, specified inside a disk of radius $R$, into a 2-D covariance function specified everywhere in $\mathcal{R}^2$ and which matches the given 2-D covariance lags. It guarantees that the extended covariance is a 2-D psd (positive semi-definite) function, ensuring that the power spectral density will be non-negative everywhere. Although the procedure is applied to functions defined continuously on $\mathcal{R}^2$, it may also be applied to discrete covariance lags on a polar raster by interpolation, as described above. We also discuss when such an extension is impossible, and how this is manifested in the algorithm.

The  problem  addressed  is  as  follows: 

Covariance Extension Problem: Given a set of covariance lags $\{f(r,\theta), r < R\}$ for some radius $R$, determine an extension $\{f(r,\theta), r > R\}$ of the given lags such that: (1) $f(r,\theta)$ agrees with the given values $\{f(r,\theta), r < R\}$; and (2) $f(r,\theta)$ is a 2-D psd function, meaning that its 2-D Fourier transform is real and non-negative everywhere.


B.  Radon  and  Backprojection  Transforms 

To  explain  the  procedure,  and  to  explain  why  it  is  necessary  for  covariance  extension,  we  define  the 
Radon  transform,  the  backprojection  transform,  and  note  some  causality  and  psd-preserving  properties 
of  each  transform. 

Radon Transform in Polar Coordinates: Let $f(x) = f(r,\theta)$ be a function defined on $x \in \mathcal{R}^2$. Then the Radon transform $\hat f(t,\phi)$ of $f(r,\theta)$ is

$$\hat f(t,\phi) = \mathcal{R}\{f(r,\theta)\} = \int_0^\infty \int_0^{2\pi} f(r,\theta)\,\delta(t - r\cos(\theta - \phi))\, r\, d\theta\, dr \qquad (24)$$

Note that the Radon transform is the line integral of $f(r,\theta)$ along the line $t = x\cos\phi + y\sin\phi$, where $x = r\cos\theta$ and $y = r\sin\theta$.
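
As a numerical illustration of the line-integral definition, the sketch below integrates an isotropic Gaussian along the line $t = x\cos\phi + y\sin\phi$ and compares the result with the closed form $\sqrt{2\pi}\,\sigma\,e^{-t^2/(2\sigma^2)}$. The particular Gaussian $e^{-(x^2+y^2)/(2\sigma^2)}$ is an assumption of this sketch (the exact interpolating function of (16) is not reproduced here), and the helper name is hypothetical.

```python
import numpy as np

def radon_line_integral(f, t, phi, half_length=10.0, n=20001):
    """Approximate the integral of f(x, y) along the line
    t = x cos(phi) + y sin(phi) with a Riemann sum over the position s."""
    s = np.linspace(-half_length, half_length, n)   # position along the line
    x = t * np.cos(phi) - s * np.sin(phi)
    y = t * np.sin(phi) + s * np.cos(phi)
    ds = s[1] - s[0]
    return np.sum(f(x, y)) * ds

sigma = 0.15   # illustrative width, not tied to any particular example

def gauss(x, y):
    return np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2))

numeric = radon_line_integral(gauss, t=0.1, phi=0.7)
analytic = np.sqrt(2 * np.pi) * sigma * np.exp(-0.1 ** 2 / (2 * sigma ** 2))
```

For an isotropic function the result does not depend on $\phi$, which is why the closed form involves only $t$.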

Backprojection Transform: Let $f(x) = f(r,\theta)$ be a function defined on $x \in \mathcal{R}^2$. Then the backprojection transform $\check f(t,\phi)$ of $f(r,\theta)$ is

$$\check f(t,\phi) = \mathcal{B}\{f(r,\theta)\} = \int_0^\infty \int_0^{2\pi} f(r,\theta)\,\delta(r - t\cos(\theta - \phi))\, d\theta\, dr = \int_0^\pi f(r = t\cos(\theta - \phi), \theta)\, d\theta \qquad (25)$$

Note that the backprojection transform is the circular mean of $f(r,\theta)$ on the circle $r = t\cos(\theta - \phi)$ (note that the point $(r,\theta)$ coincides with the point $(-r,\theta + \pi)$; this is why the integral over $\theta$ varies only from 0 to $\pi$ rather than $2\pi$). This circle passes through the origin, has diameter $t$, and has its center at $((t/2)\cos\phi, (t/2)\sin\phi)$. The backprojection transform is also half the adjoint of the Radon transform ([12], p. 134); note that (24) and (25) differ primarily in that $r$ and $t$ have been interchanged.

Anticausality of Radon Transform: Let $\hat f(t,\phi) = \mathcal{R}\{f(r,\theta)\}$. Then for any $T > 0$, $\hat f(T,\phi)$ depends only on the values $\{f(r,\theta), r > T\}$. This is clear since $\hat f(T,\phi)$ is the line integral of $f(\cdot)$ along the line $T = x\cos\phi + y\sin\phi$, whose minimum distance from the origin is $T$. It is also true that given $\{\hat f(t,\phi), t > T\}$, it is possible to reconstruct $\{f(r,\theta), r > T\}$; an explicit formula has been given by Cormack [18].

This anticausality explains why the above spectral estimation procedure does not preserve the given covariance lags. Any given covariance lag at radius $T$ depends on all values $t > T$ of the Radon transform of the covariance. But these values for $t > R$ have been changed from zero by the 1-D extensions applied to $\hat f(t,\phi)$ independently for each $\phi$. Hence the extended covariance does not match the given covariance lag.

Causality of Backprojection Transform: Let $\check f(t,\phi) = \mathcal{B}\{f(r,\theta)\}$. Then for any $T > 0$, $\check f(T,\phi)$ depends only on the values $\{f(r,\theta), r < T\}$. This is clear since $\check f(T,\phi)$ is the circular mean of $f(\cdot)$ along the circle $r = T\cos(\theta - \phi)$, so that $r < T$ always. Another way to see this is to note that backprojection at the point $(T,\phi)$ can also be viewed as the integration over all lines $r = x\cos\theta + y\sin\theta$ passing through $(T,\phi)$; any such line must pass closer to the origin than $T$, so that any such line will have $r < T$. It is also true that given $\{\check f(t,\phi), t < T\}$, it is possible to reconstruct $\{f(r,\theta), r < T\}$ (see [19]).
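
The causality property can be checked numerically with a direct discretization of the single-integral form of (25). This is a minimal sketch; the function name, the midpoint rule, and the convention that `f` accepts a signed radius are assumptions of the illustration.

```python
import numpy as np

def backproject(f, t, phi, n=2000):
    """Approximate B{f}(t, phi) as the integral over theta in [0, pi) of
    f(r = t cos(theta - phi), theta), where r may be negative."""
    theta = (np.arange(n) + 0.5) * np.pi / n   # midpoint rule on [0, pi)
    r = t * np.cos(theta - phi)
    return np.sum(f(r, theta)) * np.pi / n

def hollow(r, theta):
    """A 'hollow' function: zero inside the unit disk, one outside."""
    return (np.abs(r) > 1.0).astype(float)

inside = backproject(hollow, t=0.9, phi=0.3)    # circle stays inside the disk
outside = backproject(hollow, t=2.0, phi=0.3)   # circle leaves the disk
```

Since $|t\cos(\theta-\phi)| \le t$, the backprojection at $t = 0.9$ never samples the region $r > 1$ and is therefore zero, while at $t = 2$ two thirds of the angles satisfy $|2\cos(\theta-\phi)| > 1$, giving a value near $2\pi/3$.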

Inverse Radon Transform by Backprojection: Let $\hat f(t,\phi) = \mathcal{R}\{f(r,\theta)\}$. Then we may recover $f(r,\theta)$ from $\hat f(t,\phi)$ by computing

f{r.9)  =  (26) 

where $\mathcal{H}$ denotes the Hilbert transform $\mathcal{H}\{f(t)\} = f(t) * \frac{1}{\pi t}$. This is the well-known technique of filtered backprojection [12]. Note that here $\hat f(t,\phi)$ is regarded as a collection of functions indexed by $\phi$, rather than as a continuous function of polar coordinates $(t,\phi)$.

Positive Semi-Definite Properties of $\mathcal{H}$ and $\mathcal{B}$: Let $f(r,\theta)$ be a 2-D psd function. Then $\hat f(t,\phi) = \mathcal{R}\{f(r,\theta)\}$ is a 1-D psd function in $t$ for each $\phi$ by the projection-slice theorem of the Radon transform, and $\mathcal{H}\frac{\partial}{\partial t}\hat f(t,\phi)$ is also a 1-D psd function in $t$ for each $\phi$, since the filtering operation corresponds to multiplication by $|k|$ in the Fourier domain: $\mathcal{F}_{t\to k}\{\mathcal{H}\frac{\partial}{\partial t}\hat f(t,\phi)\} = |k|\,\mathcal{F}_{t\to k}\{\hat f(t,\phi)\}$. Hence the inverse backprojection transform $\mathcal{B}^{-1}$ maps 1-D psd functions to 2-D psd functions, as does the inverse Radon transform.


C.  Covariance  Extension  Procedure 

We propose the following procedure for extending a given set of covariance lags $f(x)$, $x \in \mathcal{R}^2$, $|x| < R$ into a function $f(x)$, $x \in \mathcal{R}^2$ specified everywhere in $\mathcal{R}^2$ and which agrees with the given set of covariance lags:

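1. Compute the Radon transform $\hat f(t,\phi)$ of the function $f(r,\theta)$ defined by $f(r,\theta) = f(x)$ if $r = |x| < R$; 0 if $r > R$. Note that $\hat f(t,\phi) = 0$ for $t > R$ by anticausality of the Radon transform;

2. Compute $\mathcal{H}\frac{\partial}{\partial t}\hat f(t,\phi)$ from $\hat f(t,\phi)$. Note that $\mathcal{H}\frac{\partial}{\partial t}\hat f(t,\phi) \neq 0$ for $t > R$ due to the smearing effect of the Hilbert transform $\mathcal{H}$;

3. Replace the values of $\mathcal{H}\frac{\partial}{\partial t}\hat f(t,\phi)$ for $t > R$ with values such that the result is 1-D psd in $t$ for each $\phi$. Call this new function $\overline{\mathcal{H}\frac{\partial}{\partial t}\hat f}(t,\phi)$; note that it equals $\mathcal{H}\frac{\partial}{\partial t}\hat f(t,\phi)$ for $t < R$;

4. Compute $F(r,\theta)$ by backprojecting $\overline{\mathcal{H}\frac{\partial}{\partial t}\hat f}(t,\phi)$ as in (26). $F(r,\theta)$ is the 2-D psd extended covariance function.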

By the causality property of $\mathcal{B}$, $F(r,\theta) = f(r,\theta)$ for $r < R$, so that the extended covariance matches the given covariance lags. By the psd property of $\mathcal{B}$, $F(r,\theta)$ is a 2-D psd function since $\overline{\mathcal{H}\frac{\partial}{\partial t}\hat f}(t,\phi)$ is a 1-D psd function in $t$ for each $\phi$. Hence we have successfully extended the given covariance lags into a 2-D psd covariance function $F(r,\theta)$.

Note that the only difference between this procedure and the previous procedure is that the filtering $\mathcal{H}\frac{\partial}{\partial t}$ of the Radon transform of the given covariance lags is computed before performing the 1-D psd extensions, instead of after. This seemingly minor change allows the use of the causality property of $\mathcal{B}$, instead of the anticausality property of $\mathcal{R}^{-1}$.

It might seem at first glance that this constructive procedure allows any 2-D set of covariance lags specified inside a disk of radius $R$ to be extended into a 2-D psd covariance function. This seems to contradict the known fact [7] that some sets of covariance lags are not extendible into a 2-D psd covariance function. The resolution of this paradox is found by noting that it may not be possible to form a 1-D psd function from the $\mathcal{H}\frac{\partial}{\partial t}\hat f(t,\phi)$ computed from the given covariance lags. For example, if for any $\phi$ there is a $t < R$ such that $\mathcal{H}\frac{\partial}{\partial t}\hat f(t,\phi) > \mathcal{H}\frac{\partial}{\partial t}\hat f(0,\phi)$, then the 1-D psd extension cannot be performed for that $\phi$, since any psd function $g(t)$ must have the property that $g(t) \le g(0)$. This explains how a 2-D extension may be impossible.
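
A discrete analogue of this extendibility test can be sketched as follows. The check is an illustration only: it works on sampled lags rather than on the continuous functions of the text, the function name is hypothetical, and it tests non-negativity of the symmetric Toeplitz matrix built from the lags, which is the standard condition for a finite set of 1-D lags to admit a psd extension.

```python
import numpy as np

def is_psd_sequence(g, tol=1e-10):
    """Return True if the symmetric Toeplitz matrix built from the lags
    g[0..n-1] has no eigenvalue below -tol."""
    g = np.asarray(g, dtype=float)
    n = len(g)
    idx = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    return bool(np.linalg.eigvalsh(g[idx]).min() >= -tol)

ok = is_psd_sequence([1.0, 0.5, 0.25])    # lags of a valid covariance
too_big = is_psd_sequence([1.0, 1.2])     # violates g(t) <= g(0)
subtle = is_psd_sequence([1.0, 0.9, 0.0]) # satisfies the bound, still not psd
```

The third case shows that $g(t) \le g(0)$ is necessary but not sufficient: the lags $(1, 0.9, 0)$ respect the bound, yet their Toeplitz matrix has the negative eigenvalue $1 - 0.9\sqrt{2}$.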

Another important question is: Can all possible 2-D psd extensions of the given $f(x)$, $x \in \mathcal{R}^2$, $|x| < R$ be found from all of the possible 1-D psd extensions of the $\mathcal{H}\frac{\partial}{\partial t}\hat f(t,\phi)$? Unfortunately, the answer is no. To see why, we now investigate briefly the nullspace of $\mathcal{B}$.

D. Nullspace of Backprojection Operator

Let $f(x)$, $x \in \mathcal{R}^2$, $|x| > R$ be an extension of given values $f(x)$, $|x| < R$. Now compute the filtered Radon transforms of both the given values $f(x)$, $|x| < R$ and the extended values $f(x)$, $|x| > R$ (note the latter is a “hollow” function):

$$f_{ext}(t,\phi) = \mathcal{H}\tfrac{\partial}{\partial t}\mathcal{R}\{f(x), |x| > R;\ 0, |x| < R\}$$

$$f_{int}(t,\phi) = \mathcal{H}\tfrac{\partial}{\partial t}\mathcal{R}\{f(x), |x| < R;\ 0, |x| > R\}$$

Here $f_{int}(t,\phi)$ is the function which is extended to create a 1-D psd function in the procedure we proposed above, and by construction, $\mathcal{B}\{f_{ext}(t,\phi)\} = 0$ for $t < R$.

We now consider the following question: Does $f_{ext}(t,\phi) = 0$ for $t < R$? That is, is there a non-zero function $f_{ext}(t,\phi)$ such that $\mathcal{B}\{f_{ext}(t,\phi)\} = 0$, i.e., does $\mathcal{B}$ have a non-empty nullspace?

The significance of this question is as follows. If $\mathcal{B}$ does NOT have a non-empty nullspace, then $f_{ext}(t,\phi) = 0$ for $t < R$. Then

$$\mathcal{H}\tfrac{\partial}{\partial t}\mathcal{R}\{f(x)\} = f_{ext}(t,\phi) + f_{int}(t,\phi) = f_{int}(t,\phi), \qquad t < R$$

and ANY extension of given values $f(x)$, $|x| < R$ is associated with an extension of $f_{int}(t,\phi)$, so that ALL 2-D psd extensions of the given lags are associated with 1-D psd extensions of $f_{int}(t,\phi)$. But if $\mathcal{B}$ HAS a non-empty nullspace containing some non-zero $f_{ext}(t,\phi)$, then

$$\mathcal{H}\tfrac{\partial}{\partial t}\mathcal{R}\{f(x)\} = f_{ext}(t,\phi) + f_{int}(t,\phi) \neq f_{int}(t,\phi), \qquad t < R$$

so that the extended $f(x)$ is NOT associated with 1-D extensions of $f_{int}(t,\phi)$, $t < R$, but with 1-D extensions of $f_{ext}(t,\phi) + f_{int}(t,\phi)$, $t < R$. This implies that not all 2-D extensions of $f(x)$, $|x| < R$ can be found from 1-D extensions of $f_{int}(t,\phi)$.

Unfortunately, $\mathcal{B}$ DOES have a non-empty nullspace, so that it is not true that all 2-D psd extensions of a given set of covariance lags can be found using the procedure proposed above. Indeed, it might seem that for ANY function $f(x)$ such that $f(x) = 0$, $|x| < R$, we would have $\mathcal{H}\frac{\partial}{\partial t}\mathcal{R}\{f(x)\} \neq 0$ for $t < R$. Of course this is not true; indeed, our procedure constructs functions $f(x) = 0$, $|x| < R$ such that $\mathcal{H}\frac{\partial}{\partial t}\mathcal{R}\{f(x)\} = 0$ for $t < R$. Furthermore, we have the following theorem:

Theorem: 

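Let $f(r,\theta) = 0$ for $r < R$ and let $\mathcal{H}\frac{\partial}{\partial t}\mathcal{R}\{f(r,\theta)\} = 0$ for $t < \epsilon$, for any $\epsilon > 0$. Then $\mathcal{H}\frac{\partial}{\partial t}\mathcal{R}\{f(r,\theta)\} = 0$ for all $t < R$.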


Proof:  To  prove  this  theorem  we  need  the  following  lemma: 

Lemma: 

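Let $g(r,\theta)$ be any continuous function equaling zero at the origin. Define $g'(r,\theta) = g(R/r,\theta)$, compute the Radon transform $\hat g'(t,\phi)$ of $g'(r,\theta)$, and define $\tilde g(t,\phi) = \hat g'(R/t,\phi)$. Then $\tilde g(t,\phi) = Rt\,\mathcal{B}\{g(r,\theta)/r^2\}$.

Proof of Lemma: We have

$$\begin{aligned}
\tilde g(t,\phi) &= \mathcal{R}\{g(R/r,\theta)\}_{t \to R/t} = \int_0^\infty \int_0^{2\pi} \delta(R/t - r\cos(\theta - \phi))\, g(R/r,\theta)\, r\, d\theta\, dr \\
&= \int_0^\infty \int_0^{2\pi} \delta(R/t - (R/r)\cos(\theta - \phi))\, g(r,\theta)\, R^2 r^{-3}\, d\theta\, dr \\
&= \int_0^\infty \int_0^{2\pi} \delta(r - t\cos(\theta - \phi))\, g(r,\theta)\, tR\,r^{-2}\, d\theta\, dr
\end{aligned} \qquad (27)$$

where we have changed variables from $r$ to $R/r$ and used the scaling property $\delta(xR/(rt)) = (rt/R)\,\delta(x)$ of the impulse. ■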

This result is not surprising: Reflecting a function across the circle of radius $R$ amounts to taking its involute, and the involute of a line (along which the Radon transform is computed) is a circle (along which the circular mean, i.e., the backprojection, is computed).

Proof of Theorem: For convenience in using the Lemma, switch the variables $t$ and $r$, and $\phi$ and $\theta$. Define $g(t,\phi) = Rt\,f(t,\phi)$. Then

$$\mathcal{H}\tfrac{\partial}{\partial t}\mathcal{R}\{f(r,\theta)\} = \mathcal{B}^{-1}\{g(t,\phi)/(Rt)\} = \tilde g(r,\theta)/r^2$$

where $\tilde g(r,\theta) = \mathcal{R}^{-1}\{g(R/t,\phi)\}_{r \to R/r}$. But we are given that $g(t,\phi) = Rt\,f(t,\phi) = 0$ for $t < R$, which implies that $g(R/t,\phi) = 0$ for $t > 1$. But then $\mathcal{R}^{-1}\{g(R/t,\phi)\} = 0$ for $r > 1$, so that $\tilde g(r,\theta) = \mathcal{R}^{-1}\{g(R/t,\phi)\}_{r \to R/r} = 0$ for $r < R$. The result follows immediately. ■

The heart of the above proof is the conclusion that $g(R/t,\phi) = 0$ for $t > 1$ implies that $\mathcal{R}^{-1}\{g(R/t,\phi)\} = 0$ for $r > 1$. Although this seems obvious, it is not in fact true unless $\mathcal{R}^{-1}\{g(R/t,\phi)\}$ is also known to go to zero sufficiently fast as $r \to \infty$. This is why we also need the condition that the filtered Radon transform vanishes for $t < \epsilon$, so that $\mathcal{R}^{-1}\{g(R/t,\phi)\}$ is known to be zero for $r > R/\epsilon$.

The major point of this section is that the inability of our proposed procedure to specify all of the 2-D psd extensions of the given covariance lags, due to the non-empty nullspace of $\mathcal{B}$, is not as bad as it may first appear.

VII  CONCLUSION 

A procedure for estimating the power spectral density of a homogeneous random field from discrete data inside a disk of finite radius has been presented. Unlike spectral density estimators using 2-D linear prediction on a rectangular raster, the estimated spectral density is guaranteed to be non-negative, since the extended (in the Radon transform domain) covariance is guaranteed to be psd.

The procedure operates by employing a novel interpolation technique, using Gaussian basis functions to compute the Radon transform analytically from a few discrete data points. 1-D linear prediction is then used along each slice to compute spectral density estimates along each slice of the Radon transform. The procedure can be viewed as an “autocorrelation” method, since the unknown data is windowed to zero both for purposes of computing the Radon transform and for fitting the 1-D AR models to each slice of the Radon transform. This procedure also provides high-resolution spectral estimates for the data on the polar raster. Some numerical examples are provided to demonstrate the validity of this procedure.

FIGURE HEADING


1. Figure 1: The polar raster on which the 2-D random field is defined with M = 8.

2. Figure 2: Spectrum of $S_1(\omega_1,\omega_2) = 4e^{-0.03(\omega_1^2+\omega_2^2)}$.

3. Figure 3: 2-D periodogram for Example 1.

4. Figure 4: Spectrum obtained by using the covariance extension for Example 1.

5. Figure 5: Spectrum of $S_2(\omega_1,\omega_2) = 10$ if $\omega_1^2 + \omega_2^2 < (0.645\pi)^2$, $= 0$ otherwise.

6. Figure 6: 2-D periodogram for Example 2a.

7. Figure 7: Spectrum using the proposed method for Example 2a.

8. Figure 8: 2-D periodogram for Example 2b.

9. Figure 9: Spectrum using the proposed method for Example 2b.

10. Figure 10: 2-D periodogram for Example 3a using normalized interpolating function.

11. Figure 11: Spectrum for Example 3a using normalized interpolating function and the proposed method.

12. Figure 12: 2-D periodogram for Example 3b using normalized interpolating function.

13. Figure 13: Spectrum for Example 3b using normalized interpolating function and the proposed method.


Figure 1: The polar raster on which the 2-D random field is defined with M = 8.

Figure 2: Spectrum of $S_1(\omega_1,\omega_2) = 4e^{-0.03(\omega_1^2+\omega_2^2)}$.

Figure 3: 2-D periodogram for Example 1.

Figure 4: Spectrum obtained by using the covariance extension for Example 1.

Figure 5: Spectrum of $S_2(\omega_1,\omega_2) = 10$ if $\omega_1^2 + \omega_2^2 < (0.645\pi)^2$, $= 0$ otherwise.

Figure 6: 2-D periodogram for Example 2a.

Figure 7: Spectrum using the proposed method for Example 2a.

Figure 9: Spectrum using the proposed method for Example 2b.

Figure 12: 2-D periodogram for Example 3b using normalized interpolating function.

Figure 13: Spectrum for Example 3b using normalized interpolating function and the proposed method.

W.-H. Fang and A.E. Yagle, “A Systolic Architecture for New Split Algorithms for

Abstract

Recently, new fast algorithms have been developed for computing the optimal linear least-squares prediction filters for arbitrary Toeplitz-plus-Hankel covariances [1]. In this correspondence, we propose a systolic architecture that can fully express the inherent concurrency of this highly parallelizable algorithm. The simplification of this array structure for centrosymmetric covariances is also addressed.

I INTRODUCTION


The advent of high speed, low cost VLSI devices has changed the field of signal processing dramatically. Due to their tremendous computational capability, more sophisticated algorithms have become feasible through special-purpose devices, e.g. ASICs (application-specific ICs). Under such circumstances, the conventional criterion of number of computations alone is no longer an effective measure of overall performance. The structure of the algorithm and its corresponding hardware architecture play an even more important role. More specifically, an efficient algorithm is defined in terms of its parallelization and the possibility of hardware structures that can fully express its parallelism so that minimal time complexity can be achieved.

Recently, new split algorithms were developed for computing the linear least-squares prediction filters for arbitrary Toeplitz-plus-Hankel covariances [1]. These fast algorithms not only are highly parallel but also perform regular iterative computations. In addition, the Laplacian operator appearing in all the recurrences is an operation involving only closest neighbors. With these desired properties (parallelization and local communication), it is natural that there exist some highly concurrent VLSI computing processors for these fast algorithms such that the overall time complexity can be further decreased. This correspondence confirms this conjecture by proposing some corresponding hardware architectures which are amenable to VLSI implementations.

Special attention will be given to the systolic array architecture. This specific hardware structure (array processors) has several desirable features, such as making multiple use of input data (pipeline processing), using extensive concurrency, involving only a few types of simple cells (saving design cost), and simple and regular data flow (local communication) [2]. In what follows we use the procedures proposed in [3] to map the fast algorithms of [1] onto some systolic architectures. After we put in the initial conditions, the results will rhythmically pump out of these array processors.

This correspondence is organized as follows. We begin with a brief review of the fast algorithms of [1]. A systolic architecture is then developed to implement these fast algorithms. The array structure and the required control program are discussed. Its simplification for centrosymmetric covariances will also be addressed. Finally, we conclude this paper with a summary and future perspective.


II  SYSTOLIC  ARCHITECTURES  FOR  THE  NEW  SPLIT 

ALGORITHMS 


A. Review of the New Split Algorithms of [1]

The problem considered is as follows. From the $2i-1$ noisy observations $\{y_{i-1}, y_{i-2}, \ldots, y_{-(i-1)}\}$ of a zero-mean, real-valued discrete random process $\{x_k\}$, compute the linear least-squares estimates $\hat x_i$ of $x_i$ (forward prediction) and $\hat x_{-i}$ of $x_{-i}$ (backward prediction) for $i = 1, 1.5, 2, 2.5, \ldots$.

The observations $\{y_k\}$ are related to the process $\{x_k\}$ by $y_k = x_k + u_k$, where $\{u_k\}$ is a zero-mean discrete-time white noise process with unit power, and $\{x_k\}$ and $\{u_k\}$ are uncorrelated. The estimates of $x_i$ and $x_{-i}$ are computed from the observations using

$$\hat x_i = \sum_{j=-(i-1)}^{i-1} h(i,j)\, y_j, \qquad \hat x_{-i} = \sum_{j=-(i-1)}^{i-1} h(-i,j)\, y_j, \qquad i = 1, 1.5, 2, \ldots \qquad (1)$$

The prediction filters $h(i,j)$ are computed by solving the following Wiener-Hopf equation (for $i = \pm 1, \pm 1.5, \pm 2, \ldots$)

$$k(i,j) = h(i,j) + \sum_{n=-(|i|-1)}^{|i|-1} h(i,n)\, k(n,j), \qquad -(|i|-1) \le j \le |i|-1 \qquad (2)$$

The goal of [1] is to derive fast algorithms for solving (2) when $k(i,j)$ ($= E[x_i x_j]$) has the Toeplitz-plus-Hankel structure, i.e. $k(i,j) = k_1(i-j) + k_2(i+j)$. For a $(2I_{max}-1)^{th}$ order linear prediction problem (from $i = -(I_{max}-1)$ to $I_{max}-1$), the overall procedure for the new split algorithm of [1] can be summarized as follows:

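1. Initializations:

$$h(\pm 1, 0) = \frac{k(\pm 1, 0)}{1 + k(0,0)}$$

$$s(\pm\tfrac12, j) = \delta_{\pm\frac12, j} + k(\pm\tfrac12, j) \quad \text{for } j = \pm\tfrac12, \pm 1\tfrac12, \ldots, \pm(2I_{max} - \tfrac12)$$

$$s(\pm 1, j) = \delta_{\pm 1, j} + k(\pm 1, j) - h(\pm 1, 0)\,k(0, j) \quad \text{for } j = \pm 1, \pm 2, \ldots, \pm(2I_{max} - 1)$$

2. Computation of the non-local potentials $V_i$:

Compute the $V_i$ by solving the following 2 × 2 simultaneous equation: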


s(i-i,/-i)  5(-(i  -  i),2  -  i) 

sii  -  ^,i-  ^)  -  siij) 

5(j  -  -  ^))  s(-{i  -  i),-(t-  3)) 

.  . 

sii  -  i,-(i  -  i))  -  s(i,-i) 

3.  (a)  Generalized  Levinson  algorithm 
i.  Border  points 


(3) 


+  1  ,  -  i)  =  h{i,i  -  1)  -  V;‘;  hit  -(a  -  ^))  =  h(i. -a  -  D)  -  (4) 

ii.  Nonborder  points  (for  — (i  —  §)  <  i  <  (i  —  f)) 

h{i  +  ^,j)  =  h(iJ  +  ^)  +  h(i,j-^)-hii-^,j)  +  V,^h(i-^,j)  +  V,^h(-ii-^),j)  (5) 
(b)  Generalized  Schur  algorithm  (  for  i '  +  ^  <  j  <  2/max) 

s{)  +  ^J)  =  s{ij  +  •^)  +  siij  -  -  sii  -  +  V,^s{i  -  ^J)  +  V,^si-(i  -  i),  j)  (6) 


4. Continue Steps 2 to 3 from $i = 1$ to $I_{max}$ in increments of $\frac12$


where  the  Schur  variable  sii,j)  =  6,,j  +  k{i,j)  —  h{i,j)  —  k(t,  n)k(n,j).  Note  that  from  (2), 

s(i,j>  =  0  if  IjI  <  |aj 
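The $2 \times 2$ system for the non-local potentials in Step 2 is small enough to solve in closed form. The sketch below uses Cramer's rule. The function name and argument names are hypothetical, chosen here only to label the six Schur-variable values that enter the system.

```python
def solve_potentials(s_pp, s_mp, s_pm, s_mm, s_ii, s_im):
    """Solve the 2x2 system of Step 2 for (V1, V2) by Cramer's rule.

    s_pp = s(i-1/2,  i-1/2),    s_mp = s(-(i-1/2),  i-1/2),
    s_pm = s(i-1/2, -(i-1/2)),  s_mm = s(-(i-1/2), -(i-1/2)),
    s_ii = s(i, i),             s_im = s(i, -i).
    """
    det = s_pp * s_mm - s_mp * s_pm
    if det == 0:
        raise ZeroDivisionError("singular 2x2 system")
    r1 = s_pp - s_ii          # right-hand side, first row
    r2 = s_pm - s_im          # right-hand side, second row
    v1 = (r1 * s_mm - s_mp * r2) / det
    v2 = (s_pp * r2 - s_pm * r1) / det
    return v1, v2

v1, v2 = solve_potentials(2.0, 1.0, 0.5, 3.0, 1.5, 0.25)
print(v1, v2)
```

When the coefficient matrix is the identity, the potentials reduce to the right-hand side, which gives a quick sanity check of the sign conventions.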

B.  Systolic  Array  for  the  above  Fast  Algorithm 

The above fast algorithms require $24I_{\max}^2$ multiplications and divisions, and $48I_{\max}^2$ additions and
subtractions [1]. In what follows, we propose a systolic architecture that fully exploits the inherent
concurrency (parallel and pipeline processing) of this algorithm so that $O(I_{\max})$ time complexity can
be achieved.

To map this algorithm onto a corresponding array processor, we follow the procedures proposed in
[3]. First, a DG (dependence graph) is established in Figure 1, where the shaded regions denote the
region of support for the Schur algorithm. An SFG (signal flow graph) can then be derived by mapping
this DG along some feasible direction. In the sequel we choose the mapping along the $i$ direction.
Since this is a systolic direction, the resulting SFGs, which are shown in Figures 2 and 3, are also the
desired systolic arrays.

Figure 2 shows the array processor with $16I_{\max}+4$ processing elements (PE's) that implements
the generalized Schur algorithm, while Figure 3 shows the array processor with $8I_{\max}-6$ PE's that
implements the generalized Levinson algorithm. The overall architecture is the combination of the
two array processors. Figure 4 shows the operations performed in the right-hand ($i > 0$) upper
and lower PE's, respectively. The left-hand ($i < 0$) processing units are the same except that the
directions of the $V_i$'s are reversed. (Note that for clarity, the transmission of the $V_i$'s is not shown in the
array processors of Figures 2 and 3.) For convenience, the array processors in Figures 2 and 3 will be
referred to as array S and array L, respectively.

The initial conditions $s(\pm\tfrac12, \pm j)$, $s(\pm 1, \pm j')$, where $j = \tfrac12, \tfrac32, \ldots, 2I_{\max}-\tfrac12$; $j' = 1, 2, \ldots, 2I_{\max}-1$,
and $h(\pm 1, 0)$ are loaded into array S and array L, respectively, before the recursion begins. At the first
stage of the recursion, the potentials $V_i$'s are computed at the four central computing units in array S
by using (3); the $V_i$'s are then sent to all the other processing units in array S to update $s(i,j)$ by using
(6), and to array L to update $h(i,j)$ by using (4) for the border points and (5) for the nonborder
points.

After completing the updating procedures, the contents of array S (i.e. $s(i,j)$) are shifted
centerward by one unit to prepare for the next recursion. The recursion continues until $i = I_{\max}$, with
a step of $\tfrac12$ in each recursion. Note that in the updating process, the processing units are activated
only on alternate time steps. This is because the updating equations (5) and (6) involve the variables
of the previous two time steps. The results of this interleaved update after each time step are shown
in Figures 2 and 3. We find that the variables indexed with integers and half-integers "pop up"
alternately. If the computation of the non-local potentials in (3) requires time interval $\tau_1$ and the
update plus shifting operations require time interval $\tau_2$, then the total computing time
would be $(2I_{\max}-2)(\tau_1+\tau_2)$. Note that since the recurrences perform in-place computations, only
$24I_{\max}-2$ memory units are required.
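The time and storage figures quoted above are simple closed-form expressions in $I_{\max}$. The following small Python sketch evaluates them; the function names and the sample values of `tau1` and `tau2` are assumptions made for illustration (the time unit is arbitrary).

```python
def broadcast_time(i_max, tau1, tau2):
    """Total computing time (2*I_max - 2) * (tau1 + tau2) with broadcast potentials."""
    return (2 * i_max - 2) * (tau1 + tau2)

def memory_units(i_max):
    """In-place storage requirement, 24*I_max - 2 memory units."""
    return 24 * i_max - 2

# Hypothetical example: I_max = 10, potentials take 3 time units, update plus shift takes 1.
print(broadcast_time(10, 3, 1))  # 18 recursions of 4 time units each
print(memory_units(10))
```

Both quantities grow linearly in $I_{\max}$, which is the point of the systolic mapping.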

The undesired global transmission (broadcasting) of the non-local potentials $V_i$'s (see Figure 1) can
be avoided by using the concept of computational wavefront proposed in [3], in which the operation
performed in each cell is triggered by the availability of the data, instead of by the global clock. The
updating processes are finished after the computational wavefront propagates from the center to the
right  (left)  end,  for  which  the  computing  time  becomes  (2/max  -  2)(ri  -I-  7-2)  -f  ImaxTi  by  assumption. 
The  extra  time  /max^'a  is  the  price  to  avoid  the  global  communication  scheme. 

A program, which adopts the notation used in [4] and summarizes the above procedures,
is shown in Figure 5. This control program is broadcast to each PE before the arrays begin the
recursion. Note that further simplifications are possible. Since arrays S and L perform almost the
same type of operations with complementary support, we can combine both arrays into a single one
with a suitable partition. Also, since the PE's are only active at alternate time steps, pairs of adjacent
processing units can be combined so that the number of PE's can be reduced by one half.

If we solve (2) directly using Gaussian elimination, $O(I_{\max}^3)$ multiplications and
divisions, and $O(I_{\max}^2)$ memory units, are required on a sequential machine. Furthermore, this
is not a highly parallelizable procedure. Merchant and Parks provided an efficient alternative for
solving systems of equations with a Toeplitz-plus-Hankel coefficient matrix [5]. However, their approach
is to reformulate the original system as a block-Toeplitz system, and then solve it by the multichannel
Levinson algorithm, which not only requires much more complex computations (e.g. matrix inversion),
but also needs a larger data bus and more memory space.

C.  Simplification  of  the  Array  Structure  for  Centrosymmetric  Covariances 

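In the special case that $k(i,j) = k(-i,-j)$, i.e. a centrosymmetric covariance matrix, we have
$h(i,j) = h(-i,-j)$ and $s(i,j) = s(-i,-j)$, together with corresponding identities among the potentials [1]. Hence the arrays for $i < 0$ can be
dispensed with.

Further simplification is possible if we define

\[ a(i,j) \triangleq h(i,j) + h(i,-j), \qquad e(i,j) \triangleq s(i,j) + s(i,-j), \qquad V_i \triangleq V_i^1 + V_i^2; \tag{7} \]

then we obtain the following recursive expressions for $a(i,j)$ and $e(i,j)$, respectively:

\[ a(i+\tfrac12, j) = a(i, j+\tfrac12) + a(i, j-\tfrac12) - a(i-\tfrac12, j) + V_i\,a(i-\tfrac12, j) \tag{8} \]

\[ e(i+\tfrac12, j) = e(i, j+\tfrac12) + e(i, j-\tfrac12) - e(i-\tfrac12, j) + V_i\,e(i-\tfrac12, j) \tag{9} \]

and the new non-local potential $V_i$ can be computed by

\[ V_i = \left[ e(i-\tfrac12, i-\tfrac12) - e(i,i) \right] / e(i-\tfrac12, i-\tfrac12). \tag{10} \]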

Similarly, we can define

\[ a^*(i,j) \triangleq h(i,j) - h(i,-j), \qquad e^*(i,j) \triangleq s(i,j) - s(i,-j), \qquad V_i^* \triangleq V_i^1 - V_i^2, \tag{11} \]

and we obtain the same recurrences for $a^*(i,j)$, $e^*(i,j)$ and $V_i^*$ as (8), (9), and (10), respectively.

The array processors for solving the centrosymmetric matrix systems are shown in Figure 6, where
four array processors are constructed to update $a(i,j)$, $a^*(i,j)$, $e(i,j)$, and $e^*(i,j)$, respectively. The
operations performed in each PE are similar to those of Figure 4. The division cells (DIV) are used
to compute the non-local potentials $V_i$ ($V_i^*$), which are then used to update $a(i,j)$ ($a^*(i,j)$) and
$e(i,j)$ ($e^*(i,j)$), respectively. The resulting $h(i,j)$ and $h(i,-j)$ can be derived by

\[ h(i,j) = \frac{a(i,j) + a^*(i,j)}{2}; \qquad h(i,-j) = \frac{a(i,j) - a^*(i,j)}{2}. \tag{12} \]
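The centrosymmetric simplification works with the sum $a = h(i,j) + h(i,-j)$ and the difference $a^* = h(i,j) - h(i,-j)$ and recovers the filter only at the end, with one division per potential. A minimal Python sketch of these three scalar operations follows; the function names are hypothetical and the sample values are arbitrary.

```python
def split(h_pos, h_neg):
    """Form a = h(i,j) + h(i,-j) and a_star = h(i,j) - h(i,-j)."""
    return h_pos + h_neg, h_pos - h_neg

def recombine(a, a_star):
    """Recover h(i,j) and h(i,-j) from their sum and difference."""
    return (a + a_star) / 2, (a - a_star) / 2

def potential(e_prev_diag, e_diag):
    """V_i = [e(i-1/2, i-1/2) - e(i, i)] / e(i-1/2, i-1/2): one division per recursion."""
    return (e_prev_diag - e_diag) / e_prev_diag

a, a_star = split(0.75, 0.25)
print(recombine(a, a_star))   # recovers the pair passed to split
print(potential(2.0, 1.5))
```

The single division in `potential` corresponds to the division (DIV) cell of the array, replacing the $2 \times 2$ solve of the general case.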

In Figure 6, $2(2I_{\max}-1)$ PE's are required for $e(i,j)$ and $e^*(i,j)$, and $2I_{\max}$ PE's are required for


a{i,j)  and  «*(  j,  j  ).  Note  ^hat  here  we  put  two  adjacent  points  and  (i  —  -  ^))  in  each  PE,  so 

the  overall  memorv  ref’i''’''’d  is  l(2/moi  “  l)i"4/r,,ax  —  ~  If  we  use  the  con,pIementary 

support property of $e(i,j)$ ($e^*(i,j)$) and $a(i,j)$ ($a^*(i,j)$), then we can put $a(i,j)$ ($a^*(i,j)$) at the end
of arrays $e(i,j)$ ($e^*(i,j)$) and use only $2(2I_{\max}-1)+1$ PE's plus $4(2I_{\max}-1)+2$ memory units.
If the division requires time interval $\tau_1'$ and the update plus shifting operations require time interval
$\tau_2$, then the total computing time would be $(2I_{\max}-2)(\tau_1'+\tau_2)$. Again, if we use the
data-driven  computational  wavefront,  then  the  computing  time  becomes  (2/max  -  2)(rj'  -f  T2)  +  /max'll 
by  assumption. 

Since the symmetric Toeplitz matrix is a special case of a centrosymmetric matrix, we can compare
this array architecture with those proposed for solving the Toeplitz system of equations [6, 7]. We find
that not only is the architecture simpler, but the overall computational time is also reduced. This is
not surprising because we are concerned with a linear prediction problem, which has a specific right-hand
side in the matrix equation, instead of solving a general Toeplitz system of equations, which requires
the inversion of a Toeplitz matrix followed by a back substitution operation. Applying our proposed
architecture to arbitrary centrosymmetric systems of equations would require additional processors
for the back substitution. Nevertheless, the proposed architecture is capable of solving more general
problems (applicable to arbitrary Toeplitz-plus-Hankel or centrosymmetric covariances) than those of
[6, 7].

III. CONCLUSION

In this correspondence, we have developed a systolic architecture to implement the recently developed
fast algorithms of [1] to compute the optimal linear least-squares prediction filters for arbitrary
Toeplitz-plus-Hankel covariances. The overall time complexity for computing the $(2I_{\max}-1)$th-order
linear prediction filters is reduced from $O(I_{\max}^2)$ to $O(I_{\max})$ by using only $O(I_{\max})$ PE's and $O(I_{\max})$
storage. Two issues need further research: (1) modification of the above systolic architecture
so that it is capable of solving more general systems of equations with a Toeplitz-plus-Hankel coefficient matrix;
(2) extension of this architecture to the 2-D counterpart [8] of the above 1-D fast algorithms.
FIGURE HEADING


1.  Figure  1:  Dependence  graph  for  the  1-D  generalized  Levinson  and  Schur  algorithms. 

2. Figure 2: Systolic architecture for the 1-D generalized Schur algorithm.

3. Figure 3: Systolic architecture for the 1-D generalized Levinson algorithm.

4. Figure 4: The operations performed in the right-hand upper PE's (left) and lower PE's
(right) (except the boundary PE's of array L).

5.  Figure  5:  Program  performs  the  update  procedures  in  each  PE. 

6.  Figure  6:  Systolic  architecture  for  solving  the  centrosymmetric  matrix  system 


Figure 1: Dependence graph for the 1-D generalized Levinson and Schur algorithms
Figure 2: Systolic architecture for the 1-D generalized Schur algorithm


Figure 3: Systolic architecture for the 1-D generalized Levinson algorithm
Program { for PE j and the prediction filter at point i + 1/2 }

{ i will increase by 1/2 at the end of each recursion }
for j := -(i - 1/2) to (i - 1/2) do
  beat begin { i is an integer }
    receive V_i from left (right) neighbor;
    send V_i to right (left) neighbor;
    { transmission of the non-local potential }
    if j is a half-integer then begin
      if PE is a non-border cell ( j <> +/-(i - 1/2) ) then do equation (5)
      else do equation (4);
    end
    else { j is an integer }
      PE does nothing;
  beat end;

  beat begin { j is an integer } { i is a half-integer }
    { same procedures, switching the roles of "integer" and "half-integer" }
  beat end;
end;
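The control program can be read as a rule that tells each PE of array L what to do in a given beat. The Python sketch below encodes that rule under the assumption that a PE is identified by its column index j and that the current order i is known to it; the function name and the returned strings are hypothetical.

```python
def pe_action(i, j):
    """Action of the PE at column j when the order advances from i to i + 1/2.

    i and j are integers or half-integers; they are doubled so that the
    integer/half-integer tests become parity tests on ordinary integers.
    """
    two_i, two_j = round(2 * i), round(2 * j)
    if (two_i + two_j) % 2 == 0:
        return "idle"            # j is of the same type as i: nothing to do in this beat
    if abs(two_j) == two_i - 1:
        return "equation (4)"    # border cell, j = +/-(i - 1/2)
    if abs(two_j) < two_i - 1:
        return "equation (5)"    # non-border cell
    return "idle"                # outside the support of h(i + 1/2, .)

for j in (-1.5, -1, -0.5, 0, 0.5, 1, 1.5):
    print(j, pe_action(2, j))
```

The alternation between integer and half-integer columns is what leaves every PE idle on every other beat, which is why adjacent PE's can be merged in pairs.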




Figure 5: Program that performs the update procedures in each PE
Figure 6: Systolic architecture for solving the centrosymmetric matrix system
APPENDIX I


Experimental  Results 


In this section, we provide some simple simulation results obtained by applying the fast algorithm
developed above to the image restoration (smoothing) and coding problems.


1  Image  Restoration  and  Smoothing 


The objective of image restoration and smoothing is to recover the original image from a
degraded one that is contaminated by noise. Here, we consider the most common
case, in which the noise is additive white noise and the method employed to reduce the observation
noise is linear least-squares prediction or smoothing.

The comparison criterion is the improvement in signal-to-noise ratio (ISNR), which is
defined as

\[
\mathrm{ISNR\,(dB)}
= 10 \log \frac{\text{average signal power}}{\text{average power of prediction error}}
- 10 \log \frac{\text{average signal power}}{\text{average power of observation error}}
= 10 \log \frac{\text{average power of observation error}}{\text{average power of prediction error}}
\]
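Since the signal power cancels, the ISNR is just the ratio of the observation-error power to the prediction-error power, expressed in decibels. A minimal NumPy sketch follows, assuming the original, observed, and estimated data are available as equally sized arrays; the function name `isnr_db` and the toy data are illustrative.

```python
import numpy as np

def isnr_db(original, observed, estimate):
    """ISNR in dB: 10*log10(observation-error power / prediction-error power)."""
    original = np.asarray(original, dtype=float)
    observed = np.asarray(observed, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    obs_err_power = np.mean((observed - original) ** 2)
    est_err_power = np.mean((estimate - original) ** 2)
    return 10.0 * np.log10(obs_err_power / est_err_power)

# Toy data: the estimate has one tenth of the observation error amplitude.
original = [0.0, 0.0, 0.0, 0.0]
observed = [1.0, -1.0, 1.0, -1.0]
estimate = [0.1, -0.1, 0.1, -0.1]
print(isnr_db(original, observed, estimate))  # about 20 dB
```

A positive ISNR means the filter output is closer to the original than the noisy observation is.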


For each set of data, four algorithms are used to compute the resulting ISNR:
Linear Prediction (LP), Linear Prediction on zero-mean residues (LPZM),
Smoothing (SM), and Smoothing on zero-mean residues (SMZM). LP uses the fast algorithm
developed above to compute the linear prediction filter, and SM computes the smoothing filter by
combining LP with the BSK identity. LPZM (SMZM) means that the linear prediction (smoothing)
filter is applied to the zero-mean residues, which are derived by subtracting the global mean
from the original signals. For simplicity, the observation noise is white noise with unit power.
The prediction coefficients are generated by assuming that the covariance function has the form
$\rho^{|r|}$ ($\rho = 0.995 \approx 1$, and $r$ is the distance from the origin), so that the requirement that the covariance
have Toeplitz-plus-Hankel structure is satisfied.

Figure 1: Comparison of ISNR for different M (number of angular points) with covariance function $4(0.82)^r$

Figure 2: Comparison of ISNR for different M (number of angular points) with covariance function $7(0.78)^r$

Figure 5: Comparison of ISNR for 1DLP and LP for different M

From figures (1) to (4), four different isotropic random fields are generated. The covariance
functions for these four isotropic random fields are $4(0.82)^r$ for figure (1), $7(0.78)^r$ for figure (2),

122*^  for  figure  (3)  ,  and  3.5A’i(r)  for  figure  (4)  respectively.  The  points  along  each  direction  are 
fixed to 10, and ISNR is computed for different M (number of angular points).

The simulation results show that in general ISNR improves as M increases. This is a reasonable
result, since with more data points (information) available we can obtain a more accurate prediction
of the original signal. The same argument also explains why the ISNR using SM
is larger than that using LP. This is further supported by the observation that the difference in
ISNR between LP and SM becomes larger as M increases, which reflects the fact that the number of data points
available for SM is proportional to M, so that the difference in the amount of available data increases
as M increases.

The ISNR for LPZM (SMZM) is slightly better than that for LP (SM) for small M and is
approximately the same for larger M. A possible explanation is that when only a small amount of
data is available, LPZM (SMZM) satisfies the zero-mean assumption and produces a more accurate
prediction (smoothing). As more data become available, the generated data are approximately
zero mean, so that the two results do not differ much.

It is worth noting that the simulation results in figures (3) and (4) are similar to those of the
previous figures regardless of the mismatch of the covariance function. This striking result shows
that even if the linear prediction (smoothing) filters are generated from a wrong assumption about the
covariance function, the resulting ISNR is still satisfactory as long as the random field is isotropic
and highly correlated, which happens quite often in practical images. The ISNR in figure (4) is
better than that in figure (3) because the covariance function in figure (4) is more correlated and
does not decay as fast as that of figure (3). This highly correlated covariance, i.e. $\rho \approx 1$, is the
requirement to derive the above fast algorithm.

The results of LP, which uses all the available data on a polar raster, are better than those obtained using
only the data on the same line, which is equivalent to a 1-D linear prediction problem (1DLP). As
shown in figure (5), the ISNR for LP is always larger than that for 1DLP. In addition, since
1DLP only utilizes the previous samples on the same line, its ISNR is approximately the
same for all M, in contrast to that for LP (SM). This is another advantage of LP over 1DLP.
Although the algorithm for the latter, which uses the 1-D Levinson algorithm, is faster, its
performance is worse.

These simulation results confirm our claim that these two algorithms (for prediction and smoothing)
work well independent of the value of M, although the performance gets better as M increases,
i.e. as more data become available.
2 Linear Predictive Coding of Images

Note that the linear predictive coefficients can be obtained as long as the covariance
function is available. Therefore, we can either store or transmit the residues of the data instead of
the data itself, accompanied by the covariance function as side information. Since the residues are
derived by subtracting a linear combination of the previous data from the present data to reduce
the unnecessary redundancy, they are in general smaller than the data themselves. Besides,
in many cases only a few parameters, e.g. $\rho$ in the isotropic random field, are required to specify
the covariance function. Therefore, the overall storage requirement can be reduced significantly in
a finite-precision environment.

We take the previous data as examples by considering the noisy images as the original image
and the prediction errors as the prediction residues. The data in tables (1) and (2) are the same
as those of figures (4) and (7) respectively. In the following tables, we compare the average signal
power and the resulting prediction residues using both LP and 1DLP.

The experiments show that LP always provides the best performance, with residues that
are significantly smaller than the average signal power, so that the storage requirements can be
reduced. It must be emphasized that the performance depends on the test images. The results get
worse when there are large variations in the images, e.g. edges or lines. This is a limitation
of the batch least-squares method, which takes all the data into account, so that the results
cannot adapt to quick changes in the outside environment. The above result can also be regarded as
a tradeoff between performance and complexity: although complicated algorithms take more
time, they also provide optimal performance, i.e. minimal storage requirements.

  M    Average Signal Power (dB)    1DLP (dB)    LP (dB)
  4              3.73                 -1.04       -6.22
  6              3.69                 -1.16       -8.16
  8              3.70                 -1.55       -8.32
 10              3.69                 -0.52       -8.56
 12              3.68                 -1.24       -8.59
 14              3.68                 -0.88       -9.04
 16              3.68                 -1.15       -9.89

Table 1: Comparison of Average Signal and Residues Power Using 1DLP and LP for Different M
(Number of Angular Points)
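One way to read Table 1 is as a prediction gain: the number of decibels by which the residue power falls below the average signal power. The short Python sketch below computes this gain for both predictors from the tabulated values. The first value of M is taken to be 4 (an assumption, since that entry is illegible in the source), and the function name is hypothetical.

```python
# Rows of Table 1: (M, average signal power, 1-D residue power, LP residue power), all in dB.
TABLE1 = [
    (4, 3.73, -1.04, -6.22),
    (6, 3.69, -1.16, -8.16),
    (8, 3.70, -1.55, -8.32),
    (10, 3.69, -0.52, -8.56),
    (12, 3.68, -1.24, -8.59),
    (14, 3.68, -0.88, -9.04),
    (16, 3.68, -1.15, -9.89),
]

def prediction_gains(rows):
    """Return (M, gain of 1-D prediction, gain of LP) in dB for each row."""
    return [(m, signal - res_1d, signal - res_lp) for m, signal, res_1d, res_lp in rows]

for m, gain_1d, gain_lp in prediction_gains(TABLE1):
    print(f"M={m:2d}  1-D gain={gain_1d:5.2f} dB  LP gain={gain_lp:5.2f} dB")
```

The LP gain grows with M while the 1-D gain stays roughly constant, which matches the discussion of the two predictors above.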


  M    Average Signal Power (dB)    1DLP (dB)    LP (dB)
  4              5.69                 -1.10       -6.97
  6              5.68                 -1.04       -8.95
  8              5.68                 -1.57       -9.84
 10              5.68                 -1.36      -10.12
 12              5.68                 -1.29      -10.22
 14              5.67                 -0.97      -11.07
 16              5.67                 -1.16      -12.03

Table 2: Comparison of Average Signal and Residues Power Using 1DLP and LP for Different M
(Number of Angular Points)
ABSTRACT

New fast algorithms for solving arbitrary Toeplitz-plus-Hankel
systems of equations are presented. The algorithms
are analogues of the split Levinson and Schur algorithms,
although the more general Toeplitz-plus-Hankel structure
requires that the algorithms be based on a four-term recurrence;
relations with previous split algorithms are noted.
The algorithms require roughly half as many multiplications
as previous fast algorithms for Toeplitz-plus-Hankel
systems.

I. INTRODUCTION

Toeplitz-plus-Hankel (TH) systems of equations have
many important applications, such as linear prediction for
nonstationary processes with TH covariances, two-sided autoregressive
spectral estimation [1], linear-phase prediction
filter design [2], the Hildebrand-Prony spectral line estimation
procedure [3], and Padé approximation to the cosine series
expansion of an even function [4]. Integral equations
with a TH kernel arise in atmospheric scattering [5] and
rarefied gas dynamics [6].

Fast algorithms for TH systems have appeared in [7]-[9].
The new algorithms of this paper can be viewed as
split versions of those of [8], or as generalizations of the split
algorithms of [10] from symmetric Toeplitz to arbitrary TH
systems.

The heart of the new algorithms is a four-term recurrence
that generalizes the three-term recurrences of [10] to
TH matrices. This recurrence requires two multiplications
per update, half the number required by the algorithms
of [7]-[9]. This is analogous to the 50% savings in multiplications
for the split algorithms of [10] over the classical
Levinson and Schur algorithms.

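II. DERIVATION OF FOUR-TERM RECURRENCE
A. The Basic Problem


We consider the solution of the TH system

\[
\begin{bmatrix}
1 + k_{-i,-i} & \cdots & k_{i,-i} \\
\vdots & \ddots & \vdots \\
k_{-i,i} & \cdots & 1 + k_{i,i}
\end{bmatrix}
\begin{bmatrix}
1 & 0 \\
h_{-i,-(i-1)} & h_{i,-(i-1)} \\
\vdots & \vdots \\
h_{-i,i-1} & h_{i,i-1} \\
0 & 1
\end{bmatrix}
=
\begin{bmatrix}
S_{-i,-i} & S_{i,-i} \\
0 & 0 \\
\vdots & \vdots \\
0 & 0 \\
S_{-i,i} & S_{i,i}
\end{bmatrix}
\tag{1}
\]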


where the $S_{\pm i,\pm i}$ are defined from the $k_{i,j}$ and $h_{i,j}$ in
(11) below, and the $(i,j)$th element of the system matrix has
the form

\[ k_{i,j} = k_1(i-j) + k_2(i+j) \tag{2} \]

for arbitrary functions $k_1(\cdot)$ and $k_2(\cdot)$. Note in particular
that the system matrix need be neither symmetric nor persymmetric;
the only requirement is that all of the central
submatrices be nonsingular.

Updating (1) from $i$ to $i+1$ increases the size of the matrix
by two; this requires two updates, and requires that $k_{i/2,j/2}$
be defined at half-integer values $(i/2, j/2)$. If $i/2 + j/2$ is not
an integer, let $k_{i/2,j/2} = 0$; if $i/2 + j/2$ is an integer, assign
$k_{i/2,j/2}$ such that the matrix with $(i,j)$th coordinate $k_{i/2,j/2}$ is
TH. If $k_{i,j}$ is specified by the form (2), this can be done easily
by inserting the half-integer values in the functions $k_1(\cdot)$
and $k_2(\cdot)$ (note that the arguments will always be integers).

Omitting the first and last rows of (1) allows it to be
rewritten as

\[ 0 = k_{i,j} + h_{i,j} + \sum_{n=-(i-1)}^{i-1} h_{i,n}\,k_{n,j}, \qquad -(i-1) \le j \le i-1. \tag{3} \]

Now define the interpolated system of (3) as

\[ 0 = k_{i+1/2,\,j+1/2} + h_{i+1/2,\,j+1/2} + \sum_{n=-(i-1/2)}^{i-1/2} h_{i+1/2,\,n}\,k_{n,\,j+1/2} \tag{4} \]

and similarly for $-i - 1/2$. The interpolated systems for
various orders are auxiliary systems of TH systems that
are solved along with (3) by the algorithms to follow. This
artifice is necessary in order to obtain split algorithms solving
nested systems.

B. Derivation of Four-Term Recurrence for $h_{i,j}$

To make the derivation easier to follow, we consider
only positive $i$. Define the discrete wave operator $\Delta$ of a
function $f_{i,j}$ as

\[ \Delta f_{i,j} = f_{i+1/2,\,j} + f_{i-1/2,\,j} - f_{i,\,j+1/2} - f_{i,\,j-1/2}. \tag{5} \]

$\Delta$ is the discrete version of the continuous operator
$(\partial^2/\partial i^2 - \partial^2/\partial j^2)$. Note that the TH structure (2) is equivalent to

\[ \Delta k_{i,j} = 0 \quad \text{for integer } i + j. \tag{6} \]
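That a function of the form $k_1(i-j) + k_2(i+j)$ is annihilated by the discrete wave operator can be checked numerically. The Python sketch below evaluates the operator on such a function and, for contrast, on a function without this structure. The particular `k1` and `k2` are arbitrary illustrative choices, and the sketch evaluates them directly at the shifted arguments rather than using the zero-filling interpolation convention described in the text.

```python
def k1(m):
    return 0.9 ** abs(m) + 0.3 * m      # arbitrary function of i - j

def k2(m):
    return 0.5 * m * m - 2.0 * m        # arbitrary function of i + j

def k(i, j):
    return k1(i - j) + k2(i + j)

def wave(f, i, j):
    """Discrete wave operator: f(i+1/2,j) + f(i-1/2,j) - f(i,j+1/2) - f(i,j-1/2)."""
    return f(i + 0.5, j) + f(i - 0.5, j) - f(i, j + 0.5) - f(i, j - 0.5)

worst = max(abs(wave(k, i, j)) for i in range(-3, 4) for j in range(-3, 4))
print(worst)                                   # zero up to rounding error
print(wave(lambda i, j: i * i * j, 1, 2))      # a function that is not annihilated
```

The cancellation is term by term: each shifted argument of `k1` and `k2` appears once with a plus sign and once with a minus sign.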

Apply the operator $\Delta$ to (3) by writing (3) with $i$ replaced
with $i \pm 1/2$, and then $j$ replaced with $j \pm 1/2$, and
then adding and subtracting (4) appropriately. Using (5),
(6), the non-singularity of (1), and some algebra gives [16]

\[ \Delta h_{i,j} = V_i^1\,h_{i-1/2,\,j} + V_i^2\,h_{-(i-1/2),\,j}, \qquad -(i-3/2) \le j \le i-3/2, \tag{7} \]

where we have defined the potentials [11]

\[ V_i^1 = h_{i+1/2,\,i-1/2} - h_{i,\,i-1}; \qquad V_i^2 = h_{i+1/2,\,-(i-1/2)} - h_{i,\,-(i-1)}. \tag{8} \]

Eq. (7) can be written as

\[ h_{i+1/2,\,j} = h_{i,\,j+1/2} + h_{i,\,j-1/2} + (V_i^1 - 1)\,h_{i-1/2,\,j} + V_i^2\,h_{-(i-1/2),\,j}. \tag{9} \]
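The four-term recurrence can be checked numerically against direct solutions of the original and interpolated systems. The Python sketch below solves $0 = k_{i,j} + h_{i,j} + \sum_n h_{i,n} k_{n,j}$ by a dense solve at orders $i - 1/2$, $-(i - 1/2)$, $i$, and $i + 1/2$, forms the two potentials from the border values, and compares both sides of the recurrence. The covariance functions and all helper names are illustrative assumptions.

```python
import numpy as np

def k1(d):
    return 0.9 ** abs(d)            # hypothetical Toeplitz part

def k2(s):
    return 0.1 * 0.8 ** abs(s)      # hypothetical Hankel part

def k(a, b):
    return k1(a - b) + k2(a + b)

def solve_order(i):
    """Directly solve the system of order i (integer or half-integer, either sign).

    Returns {n: h_{i,n}} on the support n = -(|i|-1), ..., |i|-1.
    """
    m = abs(i)
    support = [-(m - 1) + t for t in range(int(2 * m - 1))]
    K = np.array([[k(n, j) for j in support] for n in support])
    rhs = -np.array([k(i, j) for j in support])
    h = np.linalg.solve((np.eye(len(support)) + K).T, rhs)
    return dict(zip(support, h))

i = 3
h_lo, h_neg = solve_order(i - 0.5), solve_order(-(i - 0.5))
h_mid, h_hi = solve_order(i), solve_order(i + 0.5)

V1 = h_hi[i - 0.5] - h_mid[i - 1]            # potentials from the border values
V2 = h_hi[-(i - 0.5)] - h_mid[-(i - 1)]

errors = []
j = -(i - 1.5)
while j <= i - 1.5:                          # non-border points only
    predicted = h_mid[j + 0.5] + h_mid[j - 0.5] + (V1 - 1) * h_lo[j] + V2 * h_neg[j]
    errors.append(abs(h_hi[j] - predicted))
    j += 1
print(max(errors))                           # rounding-error level if the recurrence holds
```

In the fast algorithm the direct solves are of course not performed; they serve here only as an independent reference for the recurrence.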

This is the four-term recurrence at the heart of the new
algorithms. It is analogous to the three-term recurrence on
which the split algorithms of [10] are based, although there
are some differences (see Section VI).

III. NEW SPLIT LEVINSON ALGORITHM

The four-term recurrence (9) can be propagated in increasing
$|i|$ and $-(|i| - 3/2) \le j \le |i| - 3/2$. Note that
for $i$ an integer/half-integer, $j$ will take half-integer/integer
values, respectively. However, since (9) does not hold for
$j = \pm(i-1)$, we must update $h_{i,\pm(i-1)}$ using (8), and similarly
for $h_{-i,\pm(i-1)}$. Also, (8) and (9) require $V_i^1$ and $V_i^2$ to
be supplied separately; note that (8)
cannot be used to compute $V_i^1$ and $V_i^2$, since (8) is needed
to update $h_{\pm i,\pm(i-1)}$. We now show how $V_i^1$ and $V_i^2$ can be
computed from previously computed $h_{i,j}$ and $k_{i,j}$.

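A. Computation of $V_i^1$ and $V_i^2$

Setting $j = i-1$ in (3) and (4) gives

\[ h_{i+1/2,\,i-1/2} = -k_{i+1/2,\,i-1/2} - \sum_{n=-(i-1/2)}^{i-1/2} h_{i+1/2,\,n}\,k_{n,\,i-1/2} \tag{10a} \]

\[ h_{i,\,i-1} = -k_{i,\,i-1} - \sum_{n=-(i-1)}^{i-1} h_{i,n}\,k_{n,\,i-1}. \tag{10b} \]

Eq. (10b) requires only $k_{i,j}$ (known) and $h_{i,j}$ (from the
previous recursion); however, (10a) requires $h_{i+1/2,j}$, which
has not yet been computed. Substituting (9) into (10a)
and much algebra results in the following. Define the Schur
variables

\[ S_{i,j} = \delta_{i,j} + k_{i,j} + \sum_{n=-(i-1)}^{i-1} h_{i,n}\,k_{n,j}, \qquad j = \pm i. \tag{11} \]

Note $S_{i,j}$ can be computed from known $k_{i,j}$ and $h_{i,j}$. Then
it may be shown that

\[
\begin{bmatrix}
S_{i-1/2,\,i-1/2} & S_{-(i-1/2),\,i-1/2} \\
S_{i-1/2,\,-(i-1/2)} & S_{-(i-1/2),\,-(i-1/2)}
\end{bmatrix}
\begin{bmatrix} V_i^1 \\ V_i^2 \end{bmatrix}
=
\begin{bmatrix}
S_{i-1/2,\,i-1/2} - S_{i,i} \\
S_{i-1/2,\,-(i-1/2)} - S_{i,-i}
\end{bmatrix}
\tag{12}
\]

The existence of a unique solution to (12), which can easily
be found in closed form, is proved in Section V below.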
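The relation between the potentials and the Schur variables can also be checked numerically. The Python sketch below computes the Schur variables from direct solutions, solves the resulting $2 \times 2$ system for the potentials, and compares them with the potentials obtained from the border differences $h_{i+1/2,\,i-1/2} - h_{i,\,i-1}$ and $h_{i+1/2,\,-(i-1/2)} - h_{i,\,-(i-1)}$. The covariance functions and all helper names are illustrative assumptions.

```python
import numpy as np

def k1(d):
    return 0.9 ** abs(d)            # hypothetical Toeplitz part

def k2(s):
    return 0.1 * 0.8 ** abs(s)      # hypothetical Hankel part

def k(a, b):
    return k1(a - b) + k2(a + b)

def solve_order(i):
    """Directly solve 0 = k_{i,j} + h_{i,j} + sum_n h_{i,n} k_{n,j} for order i."""
    m = abs(i)
    support = [-(m - 1) + t for t in range(int(2 * m - 1))]
    K = np.array([[k(n, j) for j in support] for n in support])
    rhs = -np.array([k(i, j) for j in support])
    h = np.linalg.solve((np.eye(len(support)) + K).T, rhs)
    return dict(zip(support, h))

def schur(i, j):
    """Schur variable delta_{i,j} + k_{i,j} + sum_n h_{i,n} k_{n,j}, for j = +/- i."""
    h = solve_order(i)
    return (1.0 if i == j else 0.0) + k(i, j) + sum(h[n] * k(n, j) for n in h)

i = 3
p = i - 0.5
M = np.array([[schur(p, p), schur(-p, p)],
              [schur(p, -p), schur(-p, -p)]])
rhs = np.array([schur(p, p) - schur(i, i), schur(p, -p) - schur(i, -i)])
V = np.linalg.solve(M, rhs)                  # potentials from the Schur variables

h_mid, h_hi = solve_order(i), solve_order(i + 0.5)
V_direct = np.array([h_hi[p] - h_mid[i - 1], h_hi[-p] - h_mid[-(i - 1)]])
print(V, V_direct)                           # the two pairs should agree
```

The point of the construction is that the left-hand computation uses only quantities of order $i$ and lower, whereas the border differences need the not-yet-computed order $i + 1/2$.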

B. New Split Levinson Algorithm

Initialization: $h_{\pm 1,0} = -k_{\pm 1,0}/(1 + k_{0,0})$.
Computation of $V_i^1$, $V_i^2$: Compute $S_{i,\pm i}$ from $k_{i,j}$ (known)
and $h_{i,j}$ (from the previous recursion) using (11). Compute $V_i^1$
and $V_i^2$ from $S_{i,\pm i}$ and $S_{\pm(i-1/2),\pm(i-1/2)}$ using (12).

Update $h_{i,j}$: Compute $h_{\pm(i+1/2),\,\pm(i-1/2)}$ using (8).
Compute $h_{i+1/2,\,j}$, $|j| \le i - 3/2$, using (9).

Compute $h_{-(i+1/2),\,j}$ similarly using (7).

At this point the recursion is complete. The computed
$h_{i,j}$ for integer/half-integer $i$ and $j$ solve the original system
(3)/interpolated system (4), respectively; note that two
recursions are needed to increase the size of the system (3)
by two (i.e., update $i$ to $i + 1$).

This algorithm differs from the split Levinson algorithm
of [10] in two respects. First, the non-symmetric TH
system matrix requires four sequences $V_i^1$, $V_i^2$, $V_i^3$, and $V_i^4$ of
potentials and the four-term recurrence (13). The symmetric
Toeplitz system matrix solved by the split Levinson
algorithm of [10] requires only one sequence of potentials and
a three-term recurrence. Second, the split Levinson
algorithm of [10] propagates not $h_{i,j}$ but $h_{i,j} + h_{i,-j}$; this is
more efficient for symmetric Toeplitz matrices, but requires
recovery of $h_{i,j}$ from $h_{i,j} + h_{i,-j}$ at termination.

IV. NEW SPLIT SCHUR ALGORITHM

The "inner product" (11) is a computational bottleneck,
as in the classical Levinson algorithm. We now derive
a new split Schur-type algorithm for arbitrary TH matrices.
This algorithm can be propagated in parallel with the split
Levinson algorithm derived above; this avoids the computational
bottleneck (11). The same idea was used for the
classical Schur and Levinson algorithms in [12].

The first step is to show that the forward prediction
error filter satisfies the four-term recurrence (9). From this,
we show that the $S_{i,j}$ defined in (11) (now for all $|j| \ge i$) also
satisfy (9). Then (9), initialized using $k_{i,j}$, can be used to
compute $V_i^1$ and $V_i^2$ quickly.

A. Four-Term Recurrence for $S_{i,j}$

Define $\phi_{i,j}$ as the forward prediction error filter $\phi_{i,j} =
\delta_{i,j} + h_{i,j}$. Clearly $\phi_{i,j}$ satisfies (9) for $-(i - 3/2) \le j \le
i - 3/2$, since $\phi_{i,j} = h_{i,j}$ for these values. At $j = \pm(i - 1/2)$
or $\pm(i + 1/2)$, $\phi_{i,j}$ satisfies (9), since this reduces to (8). And
for $|j| \ge i + 3/2$, (9) reduces to $0 = 0$. Hence (9) with $h_{i,j}$
replaced with $\phi_{i,j}$ is true for all $i$ an integer/half-integer
and $j$ a half-integer/integer:

\[ \phi_{i+1/2,\,j} = \phi_{i,\,j+1/2} + \phi_{i,\,j-1/2} + (V_i^1 - 1)\,\phi_{i-1/2,\,j} + V_i^2\,\phi_{-(i-1/2),\,j}. \tag{13} \]

Next, extend the definition of $S_{i,j}$ in (11) to all integers
and half-integers $i$ and $j$ such that $i + j$ is an integer. From
(3) and (4), $S_{i,j} = 0$ for $-(i-1) \le j \le i-1$, and

\[ S_{i,j} = \sum_n (\delta_{i,n} + h_{i,n})(\delta_{n,j} + k_{n,j}) = \sum_n \phi_{i,n}\,[\delta_{n,j} + k_1(n-j) + k_2(n+j)] = \phi_{i,j} + \phi_{i,j} * k_1(-j) + \phi_{i,-j} * k_2(j), \tag{14} \]

where $*$ denotes a convolution in $j$.

Since (13) is linear in functions of $j$, it may be convolved
with $k_1(-j)$. Adding (13), the convolution of (13)
with $k_1(-j)$, and the convolution of the time-reversal of (13)
with $k_2(j)$, and using (14), gives

\[ S_{i+1/2,\,j} = S_{i,\,j+1/2} + S_{i,\,j-1/2} + (V_i^1 - 1)\,S_{i-1/2,\,j} + V_i^2\,S_{-(i-1/2),\,j}. \tag{15} \]

Hence $S_{i,j}$ also satisfies the four-term recurrence (9).

B. New Split Schur Algorithm

Initialization:  Sq  ,  =  kq  j.  =  ^±\ )2.j  +  \  /2 

Computation of $V_i^1$, $V_i^2$: Compute $V_i^1$ and $V_i^2$ from
$S_{i,\pm i}$ and $S_{i-1/2,\pm(i-1/2)}$ using (12). Similar
equations are used to compute the remaining potentials.

Update: Update $S_{i,j}$ using (15).

At this point the recursion is complete. The split Schur
algorithm can be run in parallel with the split Levinson
algorithm, supplying the potentials $V_i^1$ and $V_i^2$ while
bypassing the "inner product" computation (11) ((12) is still
necessary), as suggested in [12] for the classical algorithms.
Note  I'i- and  I'l  S'l-*!  2  fot  uii'  ni'i  m  and  half-integer 
n  -  i  2  uniquely  determines  i,  f..-  .-li.'  i,;.  i  a-  an  in¬ 
teger,  using  '  C  ' 

If the original system (3) is a discretization of an integral
equation, then $S_{i,j} \ll 1$ and the $\delta_{i,j}$ in (11)
dominates the other terms if $i = j$. In this case the solution
to (12) is simply $V_i^1 = S_{i-1/2,i-1/2} - S_{i,i}$ and
$V_i^2 = S_{i-1/2,-(i-1/2)} - S_{i,-i}$.

V. SOLUTION OF ARBITRARY TH SYSTEMS

The split algorithms above solve the systems (3) and (4); hence
they also solve (1) for the corresponding special right-hand
side. We now consider the general problem


L  -  1  - V..J  L  'I  j  L  *>. 


The 2 x 2 systems (12) and (17) have unique solutions if the
central submatrices of the system matrix (1) are nonsingular. To
see this, suppose that the 2 x 2 system matrix in (12) and (17)
is singular. Then the second column is a multiple (say $m$) of
the first column, and the column vector
$(1, \ldots, (h_{-i,j} - m\,h_{i,j}), \ldots)$ solves the
homogeneous system associated with (1), which is impossible as
long as the system matrix in (1) is nonsingular.

VI. RELATION WITH PREVIOUS SPLIT ALGORITHMS

A. Relation to the Split Algorithms of [10]

To show how the new algorithms reduce to the split algorithms of
[10], we first consider the class of TH matrices such that
$k_{i,j} = k_{-i,-j}$. In terms of (2), both $k_1(\cdot)$ and
$k_2(\cdot)$ are even functions; note that covariance functions
of time-reversible random processes have this property. The set
of centrosymmetric matrices (matrices that are both persymmetric,
$k_{i,j} = k_{-j,-i}$, and symmetric, $k_{i,j} = k_{j,i}$) is a
subset of this class. From (3), $h_{i,j} = h_{-i,-j}$; from (11),
$S_{i,j} = S_{-i,-j}$; and from (12), the potentials for $-i$
follow from those for $i$.

Hence the computations for $i < 0$ are all unnecessary. We can
go further. Defining

$$a_{i,j} = h_{i,j} + h_{i,-j}, \quad e_{i,j} = S_{i,j} + S_{i,-j}, \quad V_i = V_i^1 + V_i^2 \qquad (19)$$

replacing  j  with  -7  in  |  7  j  and  (  1  7  1  and  adding  to  1  ,  '  and 
(16i  respectively  results  in 

An,.;  =  r,o,-i  ;,  A'  ,  ,  =  I  "20) 

Adding the two equations of (12) allows $V_i$ to be computed
from $e_{i,j}$ by

r,  =  le.-,  -  I,  I  ,-i  U2l! 

From (3) and (19), $a_{i,j}$ is the solution to

.  -  i 

L-,  ;  4- L,.-;  =  a,  ;  a-  ^  (I.nkn.j  t -2 ) 


where  the  right  side  is  now  ariutian 

Define $\{c_j, -i \le j \le i\}$ recursively as follows. Let
$c_{\pm i}$ be the solution to the 2 x 2 system


'S'  9  ■' 

V''-‘ 

—  i.=  -l;-l 

,c„5„-,- 

1  5.,,  S,;  1 

i r  i  ^ 

l-iiSi.,; 

(  1 

7) 

Then the solution to (16) is given by

1  J\ 

J;  = 

t 

^  CnOn 

'£/<'- 

(18) 

ft  *  —  I 


These equations may be derived easily by taking linear
combinations (weighted by the $c_{\pm i}$) of the columns of (1)
for increasing $i$ and equating to (16). Note how this relies on
the split algorithms solving nested systems of equations as $i$
increases.


The solution to (22) can be recursively computed using the
three-term recurrences (20), along with (21). These equations
have virtually the same form as the split algorithms of [10],
even though $k_{i,j}$ is not Toeplitz.

To see what is happening here, use (2) to rewrite the left side
of (22) as

$$k_{i,j} + k_{i,-j} = k_1(i - j) + k_2(i + j) + k_1(i + j) + k_2(i - j) = k(i - j) + k(i + j) \qquad (23)$$

where $k(i) = k_1(i) + k_2(i)$. From (19), $a_{i,n} = a_{i,-n}$,
and the right side of (22) can be rewritten using this and (23),
yielding

$$k(i - j) + k(i + j) = a_{i,j} + \sum_{n=0}^{i-1} a_{i,n}\,\bigl(k(n - j) + k(n + j)\bigr). \qquad (24)$$


This is the symmetric Toeplitz system solved by the split
algorithms of [10], after shifting from a one-sided to a
two-sided interval. This shows how these algorithms are related
to the algorithms of this paper. Note that the split algorithms
of [10] propagate $a_{i,j}$, not $h_{i,j}$; $h_{i,j}$ must be
computed from $a_{i,j}$ at the end.
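The folding step in (23) can be checked numerically. The sketch below is only an illustration: the generating functions `k1` and `k2` are arbitrary choices made here (they are not taken from the paper), and the entry formula $k_{i,j} = k_1(i-j) + k_2(i+j)$ is the Toeplitz-plus-Hankel form that (23) appears to assume.

```python
# Illustrative check of (23): k_{i,j} + k_{i,-j} = k(i-j) + k(i+j), k = k1 + k2.
# k1 and k2 below are hypothetical generating functions chosen for the demo.

def k1(m):
    """Toeplitz generator (function of i - j)."""
    return 0.5 ** abs(m)

def k2(m):
    """Hankel generator (function of i + j)."""
    return 0.1 / (m * m + 1.0)

def k_entry(i, j):
    """Toeplitz-plus-Hankel entry k_{i,j} = k1(i - j) + k2(i + j)."""
    return k1(i - j) + k2(i + j)

def k_sum(m):
    """Combined generator k(m) = k1(m) + k2(m) used on the right of (23)."""
    return k1(m) + k2(m)

def max_folding_error(size):
    """Largest violation of (23) over -size <= i, j <= size."""
    worst = 0.0
    for i in range(-size, size + 1):
        for j in range(-size, size + 1):
            lhs = k_entry(i, j) + k_entry(i, -j)
            rhs = k_sum(i - j) + k_sum(i + j)
            worst = max(worst, abs(lhs - rhs))
    return worst

if __name__ == "__main__":
    print(max_folding_error(6))
```

The identity holds for any pair of generators, which is why the folded system depends only on the single function $k = k_1 + k_2$.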

VII. CONCLUSION

New fast algorithms have been derived for solving arbitrary TH
systems of equations. The new algorithms can be viewed as
analogues of the split Levinson and Schur algorithms of [10],
but applicable to a more general problem. The split Levinson
algorithm recursively computes the solution using a four-term
recurrence, but requires a non-parallelizable computation (11)
to compute the potentials. The split Schur algorithm computes
the potentials using a similar four-term recurrence; using it in
parallel with the split Levinson algorithm obviates (11) and
allows the same processor architecture to be used for both
algorithms.

ABSTRACT

New discrete generalized split Levinson and Schur algorithms for
the two-dimensional linear least-squares prediction problem on a
polar raster are derived. The algorithms compute the prediction
filter for estimating a random field at the edge of a disk, from
noisy observations inside the disk. The covariance function of
the random field is assumed to have a Toeplitz-plus-Hankel
structure for both its radial part and its transverse part. This
assumption can be shown to be closely related to some types of
random fields, such as isotropic random fields. The algorithms
generalize the split Levinson and Schur algorithms in two ways:
(1) to two dimensions; and (2) to Toeplitz-plus-Hankel
covariances.

I. INTRODUCTION

The problem of computing linear least-squares estimates of
two-dimensional random fields from noisy observations has many
applications in image processing. In particular, the
two-dimensional discrete linear prediction problem is a useful
formulation of problems in smoothing and image coding and
restoration [1].

If the random field: (1) is defined on a rectangular lattice of
points; (2) is stationary; and (3) has quarter-plane or
asymmetric half-plane causality, then the two-dimensional linear
prediction problem may be solved using the multichannel Levinson
algorithm [2,3,4].

However, in some medical imaging problems, and in spotlight
synthetic aperture radar, data are collected on a polar raster
of points, rather than on a rectangular lattice. Although such
data can be interpolated onto a rectangular lattice, this is
necessarily inexact; it also affects the covariance function.
For restoring noisy images, image coding, etc., it is clearly
desirable to develop analogues of the multichannel Levinson and
Schur algorithms applicable to discrete random fields defined on
a polar raster.

This paper develops these analogues. They generalize previous
results in three ways: (1) the random field is defined on a
polar raster; (2) the random field is not required to be
stationary; rather, its covariance must have Toeplitz-plus-Hankel
structure in both the radial and transverse directions; and (3)
the quarter-plane or asymmetric half-plane causality assumption
is replaced by a more natural causality in the radial direction
only; the prediction filters estimate the random field at a
given point using observations from all points of smaller
radius. The algorithms are generalizations of the split
algorithms [5,6].

This paper is organized as follows. In Section II, the
two-dimensional analogues of the discrete split Levinson
recurrence and split Schur recurrence for the linear prediction
problem on a polar raster are derived. The derivation is based
on the assumption that both the radial part and the transverse
part of the covariance have Toeplitz-plus-Hankel structure. In
Section III, an isotropic random field is shown to have a
Toeplitz-plus-Hankel covariance, the overall complexity of the
proposed algorithm is evaluated, and comparisons with the result
of [7] are made. Section IV concludes with a summary and a
discussion of how the results of this paper can be used to solve
the general smoothing problem.

II. DERIVATION OF THE RECURRENCE

A.  Basic  Problem 

The problem considered is as follows. From noisy observations
$\{y_{i,N}\}$ of a zero-mean real-valued discrete random field
$\{x_{i,N}\}$ at the points $(i, N)$ of a polar raster on a
disk, compute the linear least-squares estimate of $x_{i,N}$ for
all points on the edge of the disk. Here $i$ is an integer
radius from the origin, and $N$ is the integer index of the
argument (angle); if there are $M$ points distributed on the
circle of any radius, then $(i, N)$ is the point at radius $i$
and angle $2\pi N/M$.
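As a small illustration of the raster geometry just described, the sketch below maps a raster index $(i, N)$ to Cartesian coordinates and computes the squared distance between two raster points. The function names are hypothetical and introduced only for this example.

```python
import math

def raster_point(i, n, m):
    """Cartesian (x, y) of raster point (i, n): radius i, angle 2*pi*n/m."""
    angle = 2.0 * math.pi * n / m
    return (i * math.cos(angle), i * math.sin(angle))

def squared_distance(i, n1, j, n2, m):
    """Squared Euclidean distance between raster points (i, n1) and (j, n2)."""
    x1, y1 = raster_point(i, n1, m)
    x2, y2 = raster_point(j, n2, m)
    return (x1 - x2) ** 2 + (y1 - y2) ** 2

if __name__ == "__main__":
    # Two points on an 8-spoke raster, at radii 3 and 2.
    print(squared_distance(3, 1, 2, 4, 8))
```

By the law of cosines this squared distance equals $i^2 + j^2 - 2ij\cos(2\pi(N_1 - N_2)/M)$, a function of the radii and of the angular difference only.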

The observations $\{y_{i,N}\}$ are related to the field
$x_{i,N}$ by $y_{i,N} = x_{i,N} + v_{i,N}$, where $\{v_{i,N}\}$
is a zero-mean discrete white noise field with unit power, and
$\{x_{i,N}\}$ and $\{v_{i,N}\}$ are uncorrelated. The covariance
of $\{x_{i,N}\}$, $K(i,N_1;j,N_2) = E[x_{i,N_1} x_{j,N_2}]$, is
assumed to be a non-negative definite function with
Toeplitz-plus-Hankel structure in both arguments. The estimates
of $x_{i,N}$ at the edge of the disk are computed from the
observations $\{y_{i,N}\}$ using

$$\hat{x}_{i,N_1} = \sum_{j=-(i-1)}^{i-1} \sum_{N_2=1}^{M} h(i,N_1;j,N_2)\, y_{j,N_2} \qquad (1)$$

The optimal prediction filters $h(i,N_1;j,N_2)$ are computed by
solving the two-dimensional discrete Wiener-Hopf equation

$$K(i,N_1;j,N_2) = h(i,N_1;j,N_2) + \sum_{n=-(i-1)}^{i-1} \sum_{N_3=1}^{M} h(i,N_1;n,N_3)\, K(n,N_3;j,N_2) \qquad (2)$$

for all $-(i-1) \le j \le i-1$ and $1 \le N_1, N_2 \le M$. The
goal is to derive a fast algorithm for solving (2) when
$K(i,N_1;j,N_2)$ has the Toeplitz-plus-Hankel structure shown in
(5) and (6) below.
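For small problems, (2) can be solved directly as a linear system, which gives a reference against which a fast algorithm could be compared. The sketch below is not the paper's algorithm: it uses plain Gaussian elimination, and the covariance `cov` (an isotropic $\rho^{d^2}$ kernel with $\rho = 0.9$) is an assumption chosen only so that the example is concrete.

```python
import math

def cov(i, n1, j, n2, m, rho=0.9):
    """Illustrative isotropic covariance rho**(squared distance) (an assumption)."""
    d2 = i * i + j * j - 2.0 * i * j * math.cos(2.0 * math.pi * (n1 - n2) / m)
    return rho ** d2

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    size = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, size):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, size + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * size
    for r in range(size - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, size))
        x[r] = (aug[r][size] - tail) / aug[r][r]
    return x

def prediction_filter(i, n1, m):
    """Return (points, h), where h[p] = h(i, n1; j, n2) for points[p] = (j, n2)."""
    points = [(j, n2) for j in range(-(i - 1), i) for n2 in range(1, m + 1)]
    # (2) reads K(i,n1;q) = h(q) + sum_p h(p) K(p;q); K is symmetric here,
    # so the unknowns satisfy (I + G) h = b with G the covariance on the disk.
    gram = [[cov(pj, pn, qj, qn, m) + (1.0 if (pj, pn) == (qj, qn) else 0.0)
             for (qj, qn) in points] for (pj, pn) in points]
    rhs = [cov(i, n1, qj, qn, m) for (qj, qn) in points]
    return points, solve_linear(gram, rhs)

def residual(i, n1, m):
    """Largest violation of (2) by the computed filter."""
    points, h = prediction_filter(i, n1, m)
    worst = 0.0
    for q, (qj, qn) in enumerate(points):
        total = h[q] + sum(h[p] * cov(pj, pn, qj, qn, m)
                           for p, (pj, pn) in enumerate(points))
        worst = max(worst, abs(cov(i, n1, qj, qn, m) - total))
    return worst

if __name__ == "__main__":
    print(residual(3, 1, 4))
```

The direct solve costs on the order of the cube of the number of unknowns, $(2i-1)M$, for each target point; this is the cost that the structured recurrences are meant to avoid.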

We decompose the update procedure into two steps by introducing
an interpolated (auxiliary) system. As shown in Figure 1,
between every pair of points in the radial direction, we insert
an auxiliary point. The covariance function $K(i,N_1;j,N_2)$ is
interpolated at these auxiliary points such that the block
Toeplitz-plus-Hankel structure (see (5), (6)) is maintained.
Then the prediction filter can be defined at the interpolated
points as the solution to the interpolated system, which has the
form of (2) but is specified on the interpolated points.

B.  Derivation  of  the  Levinson-Like  Recurrence 


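Define the discrete wave operators $\Delta_r$ and $\Delta_\theta$ by

$$\Delta_r f(i,N_1;j,N_2) = f(i+\tfrac12,N_1;j,N_2) + f(i-\tfrac12,N_1;j,N_2) - f(i,N_1;j+\tfrac12,N_2) - f(i,N_1;j-\tfrac12,N_2) \qquad (3)$$

$$\Delta_\theta f(i,N_1;j,N_2) = f(i-\tfrac12,((N_1+1));j,((N_2))) + f(i-\tfrac12,((N_1-1));j,((N_2))) - f(i-\tfrac12,((N_1));j,((N_2+1))) - f(i-\tfrac12,((N_1));j,((N_2-1))) \qquad (4)$$

where $\Delta_r$ and $\Delta_\theta$ can be regarded as discrete
versions of the continuous operators
$(\partial^2/\partial x^2 - \partial^2/\partial y^2)$ and
$(\partial^2/\partial\theta_1^2 - \partial^2/\partial\theta_2^2)$
for the radial part and transverse part, respectively, and
$((\cdot))$ means a mod $M$ operation.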
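The operators (3) and (4) can be sketched in code as follows. This is an illustration under stated assumptions: radial indices are stored doubled (`i2 = 2*i`) so that half-integer steps become unit steps, angular indices run over `0..m-1` rather than `1..M`, and the test function `th_example` is a hypothetical field built here to have Toeplitz-plus-Hankel dependence in both the radial and the angular index pairs.

```python
import math

M_POINTS = 8  # number of angular samples used in this demo (an assumption)

def delta_r(f, i2, n1, j2, n2):
    """Radial operator (3); i2 and j2 are doubled radial indices."""
    return (f(i2 + 1, n1, j2, n2) + f(i2 - 1, n1, j2, n2)
            - f(i2, n1, j2 + 1, n2) - f(i2, n1, j2 - 1, n2))

def delta_theta(f, i2, n1, j2, n2, m=M_POINTS):
    """Transverse operator (4); angular indices wrap modulo m."""
    return (f(i2 - 1, (n1 + 1) % m, j2, n2) + f(i2 - 1, (n1 - 1) % m, j2, n2)
            - f(i2 - 1, n1, j2, (n2 + 1) % m) - f(i2 - 1, n1, j2, (n2 - 1) % m))

def th_example(i2, n1, j2, n2, m=M_POINTS):
    """Hypothetical field: (Toeplitz + Hankel in radius) times (same in angle)."""
    radial = 0.5 ** abs(i2 - j2) + 0.01 * (i2 + j2) ** 2
    angular = (math.cos(2.0 * math.pi * ((n1 - n2) % m) / m)
               + 0.3 * ((n1 + n2) % m))
    return radial * angular

def max_violation(size):
    """Largest |delta_r f| or |delta_theta f| over a block of indices."""
    worst = 0.0
    for i2 in range(-size, size + 1):
        for j2 in range(-size, size + 1):
            for n1 in range(M_POINTS):
                for n2 in range(M_POINTS):
                    worst = max(worst,
                                abs(delta_r(th_example, i2, n1, j2, n2)),
                                abs(delta_theta(th_example, i2, n1, j2, n2)))
    return worst

if __name__ == "__main__":
    print(max_violation(6))
```

Any function of $i - j$ alone or of $i + j$ alone is annihilated by the radial operator, and likewise for the angular operator with $N_1 \mp N_2$ taken modulo $M$, which is the sense in which a vanishing wave operator expresses Toeplitz-plus-Hankel structure.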

We assume that the covariance function has the block
Toeplitz-plus-Hankel structure

$$\Delta_r K(i,N_1;j,N_2) = 0 \qquad (5)$$

$$\Delta_\theta K(i,N_1;j,N_2) = 0 \qquad (6)$$

Some examples satisfying (5) and (6) can be found in [8].
Applying the Laplacian operator $\Delta = \Delta_r + \Delta_\theta$
to equation (2), we have after some algebra [8]

$$\begin{aligned} h(i+\tfrac12,N_1;j,N_2) = {} & h(i,N_1;j+\tfrac12,N_2) + h(i,N_1;j-\tfrac12,N_2) - h(i-\tfrac12,N_1;j,N_2) \\ & + h(i-\tfrac12,N_1;j,N_2+1) + h(i-\tfrac12,N_1;j,N_2-1) \\ & - h(i-\tfrac12,N_1+1;j,N_2) - h(i-\tfrac12,N_1-1;j,N_2) \\ & + \sum_{N_3=1}^{M} \left[ V_i^+(N_1,N_3)\, h(i-\tfrac12,N_3;j,N_2) + V_i^-(N_1,N_3)\, h(-(i-\tfrac12),N_3;j,N_2) \right] \end{aligned} \qquad (7)$$

for all $-(i-\tfrac32) \le j \le (i-\tfrac32)$ and
$1 \le N_1, N_2 \le M$. Here we have defined the potentials

$$V_i^+(N_1,N_2) = -\left[ h(i+\tfrac12,N_1;i-\tfrac12,N_2) - h(i,N_1;i-1,N_2) \right] \qquad (8)$$

$$V_i^-(N_1,N_2) = -\left[ h(i+\tfrac12,N_1;-i+\tfrac12,N_2) - h(i,N_1;-i+1,N_2) \right] \qquad (9)$$

Equation (7) is the basic recurrence that is the heart of the
Levinson-like algorithm. The left side is the difference of two
two-dimensional discrete Laplacian operators, analogous to the
difference of one-dimensional discrete Laplacian operators
appearing in the split algorithms of [5]. The right side
generalizes the three-term recurrence in [5] to a multi-term
recurrence; this is analogous to the matrix recurrence in [6].
However, it is applicable to non-symmetric block
Toeplitz-plus-Hankel systems (see [8]).

When $i$ is an integer and $j$ is a half-integer, equation (7)
will update $h$ from the real points to the interpolated points.
When $i$ is a half-integer and $j$ is an integer, equation (7)
will update $h$ from the interpolated points to the real points.


C. Derivation of the Schur-Like Recurrence


We still need to calculate the potentials $V_i^+(N_1,N_2)$ and
$V_i^-(N_1,N_2)$ at the beginning of every update so that we can
use the recursive formula (7). Since an inner product is a
bottleneck in a parallel processing environment, we overcome
this difficulty by introducing the Schur variables (defined at
integer and half-integer points)

$$s(i,N_1;j,N_2) = \delta_{i,j}\delta_{N_1,N_2} + K(i,N_1;j,N_2) - h(i,N_1;j,N_2) - \sum_{n=-(i-1)}^{i-1} \sum_{N_3=1}^{M} h(i,N_1;n,N_3)\, K(n,N_3;j,N_2) \qquad (10)$$

where $\delta_{i,j}\delta_{N_1,N_2} = 0$ unless $i = j$ and
$N_1 = N_2$, in which case it is unity.

Since the Schur variables are linear combinations of the
prediction error filters
$\delta_{i,j}\delta_{N_1,N_2} - h(i,N_1;j,N_2)$, equations
(7)-(8) show that $s(i,N_1;j,N_2)$ satisfies the recurrence (7),
but now for all $j$:

$$\begin{aligned} s(i+\tfrac12,N_1;j,N_2) = {} & s(i,N_1;j+\tfrac12,N_2) + s(i,N_1;j-\tfrac12,N_2) - s(i-\tfrac12,N_1;j,N_2) \\ & + s(i-\tfrac12,N_1;j,N_2+1) + s(i-\tfrac12,N_1;j,N_2-1) \\ & - s(i-\tfrac12,N_1+1;j,N_2) - s(i-\tfrac12,N_1-1;j,N_2) \\ & + \sum_{N_3=1}^{M} \left[ V_i^+(N_1,N_3)\, s(i-\tfrac12,N_3;j,N_2) + V_i^-(N_1,N_3)\, s(-(i-\tfrac12),N_3;j,N_2) \right] \end{aligned} \qquad (11)$$

Equation (11) is the basic recurrence for the Schur-like
algorithm; for $-(i-1) \le j \le (i-1)$, $s(i,N_1;j,N_2) = 0$ by
(2).
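The vanishing of the Schur variables inside the support of the filter can be seen in a small numerical experiment. The sketch below takes the scalar case $M = 1$, a hypothetical Toeplitz-plus-Hankel covariance `k_entry` chosen only for illustration, and a direct solve in place of the fast recurrences; it then evaluates the definition (10) both inside and outside $|j| \le i - 1$.

```python
def k_entry(n, j):
    """Hypothetical TH covariance: Toeplitz part plus a small Hankel part."""
    return 0.6 ** abs(n - j) + 0.05 * 0.5 ** abs(n + j)

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    size = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, size):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, size + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * size
    for r in range(size - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, size))
        x[r] = (aug[r][size] - tail) / aug[r][r]
    return x

def schur_variables(i, extra=2):
    """Return {j: s(i, j)} for |j| <= i + extra, with h(i, .) solved directly."""
    support = list(range(-(i - 1), i))
    # Scalar form of (2): k(i,j) = h(i,j) + sum_n h(i,n) k(n,j) on the support.
    gram = [[k_entry(n, j) + (1.0 if n == j else 0.0) for j in support]
            for n in support]
    h = solve_linear(gram, [k_entry(i, j) for j in support])
    h_at = dict(zip(support, h))
    s = {}
    for j in range(-(i + extra), i + extra + 1):
        inner = sum(h_at[n] * k_entry(n, j) for n in support)
        delta = 1.0 if j == i else 0.0
        # Scalar form of (10); h(i, j) is zero outside the support.
        s[j] = delta + k_entry(i, j) - h_at.get(j, 0.0) - inner
    return s

if __name__ == "__main__":
    for j, value in sorted(schur_variables(4).items()):
        print(j, value)
```

Inside the support the values are zero to rounding error because (10) there is just (2) rearranged; outside the support they carry the information from which the potentials are obtained.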

Setting $j = (i - \tfrac12)$ and $-(i - \tfrac12)$ in (11)
respectively, we can solve for $V_i^+$ and $V_i^-$ in closed form
as [8]:

$$V^+ = (X - Y (S^{--})^{-1} S^{-+})(S^{++} - S^{+-} (S^{--})^{-1} S^{-+})^{-1} \qquad (12)$$

$$V^- = (Y - X (S^{++})^{-1} S^{+-})(S^{--} - S^{-+} (S^{++})^{-1} S^{+-})^{-1} \qquad (13)$$

where we have defined the $M \times M$ matrices

$$[V^{\pm}]_{N_1,N_2} = V_i^{\pm}(N_1,N_2) \qquad (14)$$

$$[S^{\pm\pm}]_{N_1,N_2} = s(\pm(i-\tfrac12),N_1;\pm(i-\tfrac12),N_2) \qquad (15)$$

$$X_{N_1,N_2} = s(i-\tfrac12,N_1;i-\tfrac12,N_2) - s(i,N_1;i,N_2) + \Delta_\theta s(i,N_1;i-\tfrac12,N_2) \qquad (16)$$

$$Y_{N_1,N_2} = s(i-\tfrac12,N_1;-(i-\tfrac12),N_2) - s(i,N_1;-i,N_2) + \Delta_\theta s(i,N_1;-(i-\tfrac12),N_2) \qquad (17)$$

D. Summary of Overall Procedure

The overall procedure can be summarized as follows. Let
$I_{max}$ be the largest radius (maximum radial prediction
order). Then for all $1 \le N_1, N_2 \le M$:

1. Initialization

Compute $h(\pm\tfrac12,N_1;0,N_2)$, $h(\pm 1,N_1;0,N_2)$ using (2).
Compute $s(\pm\tfrac12,N_1;j,N_2)$, $s(\pm 1,N_1;j,N_2)$ using (10)
for all $j = \pm 1, \ldots, \pm 2I_{max}$.

2. Propagation of Split Schur-Like Algorithm

A. Compute the potentials $V_i^+(N_1,N_2)$ and $V_i^-(N_1,N_2)$
using (12) and (13);

B. Update the Schur variables using (11) for
$j = \pm(i+\tfrac12), \ldots, \pm 2I_{max}$.

3. Propagation of Split Levinson-Like Recurrence

A. Propagate the Boundary Points:

$$h(i+\tfrac12,N_1;i-\tfrac12,N_2) = h(i,N_1;i-1,N_2) - V_i^+(N_1,N_2) \qquad (18)$$

$$h(i+\tfrac12,N_1;-i+\tfrac12,N_2) = h(i,N_1;-i+1,N_2) - V_i^-(N_1,N_2) \qquad (19)$$

B. Propagate Non-Boundary Points:

Update $h(i,N_1;j,N_2)$ using equation (7) for
$j = -(i-\tfrac32)$ to $j = (i-\tfrac32)$.

4. Repeat steps 2 and 3 from $i = 1$ to $I_{max}$ with increment
$\tfrac12$.

Note that the above generalized Levinson and Schur recurrences
(7) and (11) are highly parallel, and perform the same type of
in-place computation. This allows a highly parallel and
pipelined architecture to be developed for this algorithm.
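The half-step schedule of the outer loop can be sketched as below. This shows only the order of the steps and which grid each update starts from and lands on; it does not carry out the recurrences. The function name is hypothetical, and the assumption that the loop includes $i = I_{max}$ is a reading of step 4 rather than something stated explicitly.

```python
from fractions import Fraction

def half_step_schedule(i_max):
    """List (i, source grid, target grid) for i = 1, 3/2, ..., i_max."""
    steps = []
    i = Fraction(1)
    while i <= i_max:
        if i.denominator == 1:
            # Integer i: the update goes from the real points to the
            # interpolated (auxiliary) points.
            steps.append((i, "real", "interpolated"))
        else:
            # Half-integer i: the update returns to the real points.
            steps.append((i, "interpolated", "real"))
        i += Fraction(1, 2)
    return steps

if __name__ == "__main__":
    for radius, source, target in half_step_schedule(3):
        print(radius, source, "->", target)
```

Using exact fractions avoids accumulating floating-point error in the loop index when deciding whether a level is real or interpolated.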

III. DISCUSSION

A.  Isotropic  Random  Field 

For an isotropic random field, the covariance is a function of
distance only, i.e., if $x$ and $y$ are two arbitrary points in
the plane, then $K(x,y) = K(|x - y|)$. Consider the special case
of an isotropic random field with covariance
$K(x,y) = \rho^{|x-y|^2}$, which is often used in image
modeling. In polar coordinates on a discrete polar raster, and
if $\rho \approx 1$, this covariance function can be represented
as

$$\begin{aligned} K(i,N_1;j,N_2) &= \rho^{\,i^2 + j^2 - 2ij\cos(2\pi(N_1-N_2)/M)} \\ &= \rho^{\,\frac12\left([(i+j)^2 + (i-j)^2] - [(i+j)^2 - (i-j)^2]\cos(2\pi(N_1-N_2)/M)\right)} \\ &\approx 1 + \tfrac12\left([(i+j)^2 + (i-j)^2] - [(i+j)^2 - (i-j)^2]\cos(2\pi(N_1-N_2)/M)\right)\ln\rho \end{aligned} \qquad (20)$$

Note that the exponent has the Toeplitz-plus-Hankel structure
required by (5) and (6), and that it is not merely
block-Toeplitz; hence the multichannel Levinson algorithm is not
applicable. If $\rho \approx 1$, the entire covariance satisfies
(5) and (6). Indeed, any slowly-changing function of distance
satisfies (5) and (6).
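The two claims about the isotropic example can be checked numerically: that the exponent $i^2 + j^2 - 2ij\cos(2\pi(N_1-N_2)/M)$ splits into a part depending on $i - j$ plus a part depending on $i + j$, and that for $\rho$ close to 1 the covariance is close to its first-order expansion. The sketch below assumes the covariance $\rho^{|x-y|^2}$ as read from (20); the function names and the value $\rho = 0.9999$ are choices made for this illustration.

```python
import math

def exponent(i, n1, j, n2, m):
    """Squared distance between raster points (i, n1) and (j, n2)."""
    c = math.cos(2.0 * math.pi * (n1 - n2) / m)
    return i * i + j * j - 2.0 * i * j * c

def exponent_th(i, n1, j, n2, m):
    """Same exponent as a Toeplitz term in (i - j) plus a Hankel term in (i + j)."""
    c = math.cos(2.0 * math.pi * (n1 - n2) / m)
    return 0.5 * (1.0 + c) * (i - j) ** 2 + 0.5 * (1.0 - c) * (i + j) ** 2

def covariance(i, n1, j, n2, m, rho):
    """Isotropic covariance rho**(squared distance)."""
    return rho ** exponent(i, n1, j, n2, m)

def first_order(i, n1, j, n2, m, rho):
    """First-order expansion 1 + exponent * ln(rho), valid for rho near 1."""
    return 1.0 + exponent(i, n1, j, n2, m) * math.log(rho)

def worst_errors(max_radius, m, rho):
    """Return (max split error, max first-order error) over the raster."""
    split_err = 0.0
    approx_err = 0.0
    for i in range(1, max_radius + 1):
        for j in range(1, max_radius + 1):
            for n1 in range(m):
                for n2 in range(m):
                    split_err = max(split_err, abs(
                        exponent(i, n1, j, n2, m) - exponent_th(i, n1, j, n2, m)))
                    approx_err = max(approx_err, abs(
                        covariance(i, n1, j, n2, m, rho)
                        - first_order(i, n1, j, n2, m, rho)))
    return split_err, approx_err

if __name__ == "__main__":
    print(worst_errors(5, 8, 0.9999))
```

The split is exact for every angular difference, while the first-order error grows with the square of (distance squared times $\ln\rho$), so the approximation degrades for widely separated points unless $\rho$ is very close to 1.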

B.  Computational  Complexity 

We determine the number of multiplications/divisions (MADs)
needed to solve (2) up to order $i = I_{max}$. The
initialization of the Levinson-like recurrences requires 2
$M \times M$ matrix inversions and 4 $M \times M$ matrix
multiplications, or $2(\ldots) + 4M^3$ MADs. The initialization
of the Schur-like recurrences requires $8I_{max}$ $M \times M$
matrix multiplications, or $8I_{max}M^3$ MADs. Each Schur-like
recursion update of $s(i,N_1;j,N_2)$ from $i$ to $i+\tfrac12$
requires $16(I_{max} - i)M^3$ MADs. Computation of the
potentials requires 4 $M \times M$ matrix inversions and 8
$M \times M$ matrix multiplications. Finally, updating
$h(i,N_1;j,N_2)$ from $i$ to $i+\tfrac12$ in the Levinson-like
recurrence requires $4(2i+1)M^3$ MADs. The total number of
multiplications needed to solve (2) up to $i = I_{max}$ is equal
to [8]

$$24 I_{max}^2 M^3 + I_{max}(\ldots + 4M^3) + (\ldots + M^2) = O(I_{max}^2 M^3) \qquad (21)$$

For large $I_{max}$, this is much less than the number of MADs
required for the solution of (2) by Gaussian elimination, which
would require

multiplications. In addition, as shown in the above procedures,
this procedure is highly parallelizable. Therefore, the overall
reduction in time complexity would be even more significant
using vector/parallel processors.
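The leading term $24 I_{max}^2 M^3$ in (21) can be recovered by summing the two per-step counts quoted in the text, $16(I_{max} - i)M^3$ for each Schur-like update and $4(2i+1)M^3$ for each Levinson-like update. The sketch below does this sum; it assumes half-steps $i = \tfrac12, 1, \ldots, I_{max}$ and leaves out the initialization and potential computations, so only the leading term is meaningful.

```python
def per_step_total(i_max, m):
    """Sum the Schur and Levinson per-step MAD counts over all half-steps."""
    total = 0
    for k in range(1, 2 * i_max + 1):           # k = 2*i, so i = k/2
        schur = 8 * (2 * i_max - k) * m ** 3     # 16 * (i_max - i) * M^3
        levinson = 4 * (k + 1) * m ** 3          # 4 * (2*i + 1) * M^3
        total += schur + levinson
    return total

def leading_term(i_max, m):
    """Leading term 24 * I_max^2 * M^3 of (21)."""
    return 24 * i_max ** 2 * m ** 3

if __name__ == "__main__":
    for i_max in (5, 50, 500):
        print(i_max, per_step_total(i_max, 4) / leading_term(i_max, 4))
```

Under these assumptions the sum works out to $(24 I_{max}^2 + 4 I_{max})M^3$, so the ratio to the leading term tends to 1 as $I_{max}$ grows.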

C.  Relations  with  Continuous  Algorithms 

It is instructive to examine the continuous-parameter limits of
some of the equations of this paper. Let the intervals between
points be $\delta_r$ in the radial direction and
$\delta_\theta = 2\pi/M$ radians in the transverse direction.
Introducing a radial weighting factor, and taking limits as
$\delta_r$ and $\delta_\theta$ go to zero, results in the
following transformations:

1. The discrete Wiener-Hopf equation (2) becomes the
Wiener-Hopf integral equation;

2. $\delta_{i,j}\delta_{N_1,N_2}$ becomes a continuous
two-dimensional impulse function, dominating the other terms in
the definition (10) of the Schur variables. The recursion (11)
now propagates the non-impulsive part of the Schur variables, so
that (12) and (13) may be replaced with $V^+ \approx X$ and
$V^- \approx Y$. Compare this to


* Numerical studies have shown that approximation (2) gives very
good results for $\delta_r \approx 0.001$, but discretization is
much more sensitive to non-infinitesimal $\delta_\theta$.


O  O 

V{xJu02)  =  -(^  +  =  XyOi)  (22) 

where $x$ and $y$ are continuous radii and $\theta_1$ and
$\theta_2$ are continuous angles. Equation (22) has the form of
(4-17b) of [7]. Similarly, the continuous version of (13) has
the form of (4-2) of [7]. Equation (7), with its difference of
discrete two-dimensional Laplacian operators on the left side,
is clearly analogous to ($\Delta_x$ = Laplacian with respect to
$x$)

$$(\Delta_x - \Delta_y)\, h(x,\theta_1;y,\theta_2) = \int_0^{2\pi} V(x,\theta_1,\theta_3)\, h(x,\theta_3;y,\theta_2)\, d\theta_3 \qquad (23)$$

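which is the two-dimensional form of (4-1) of [7]. However, (23)
is NOT the continuous limit of (15) with radial weighting, since
$\frac{1}{\sqrt{x}} \frac{\partial^2}{\partial x^2}\bigl(\sqrt{x} f(x)\bigr) = \bigl(\frac{\partial^2}{\partial x^2} + \frac{1}{x}\frac{\partial}{\partial x} - \frac{1}{4x^2}\bigr) f(x)$,
which is not the radial part of the 2-D Laplacian. On the other
hand,
$\frac{1}{x} \frac{\partial^2}{\partial x^2}\bigl(x f(x)\bigr) = \bigl(\frac{\partial^2}{\partial x^2} + \frac{2}{x}\frac{\partial}{\partial x}\bigr) f(x)$,
which is the radial part of the 3-D Laplacian. This shows that
the results of [7], derived for the continuous 3-D case, do not
apply exactly to the 2-D case (as do the results of this paper);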


3. The algorithms of this paper require the differences of the
radial parts and transverse parts of the Laplacian of the
covariance to be separately zero: (5) and (6) must be separately
zero. However, in the continuous limit, we have
$h(i,N_1;n,N_3) \approx h(i-\tfrac12,N_1;n,N_3)$; then it
suffices for the sum
$(\Delta_r + \Delta_\theta)K(i,N_1;j,N_2) = 0$* rather than (5)
and (6) separately. This agrees with the requirement
$(\Delta_x - \Delta_y)K(x,y) = 0$ for the algorithms in [7].

We can draw some important conclusions from these observations.
If the algorithms of this paper are being used to solve a
discretized Wiener-Hopf equation, then

(a) In (12) and (13) the impulses lead to diagonally-dominant
systems, so that (12) and (13) may be replaced with the
approximations $V^+ \approx X$ and $V^- \approx Y$. Therefore,
the overall complexity is further reduced by avoiding the matrix
inversions in (12) and (13);

(b) By the chain rule, any continuous function of the distance
between two points will satisfy (5) and (6), since the square of
the distance itself does. Hence the algorithms may be used for
any isotropic random field. Note in particular that (20) becomes

K{i,  JV,;  j,  Ni)  =  (24) 

and  — ♦  1  as  6,  — »  0; 

(c) Conditions (5) and (6) may be replaced with the more general
condition $(\Delta_r + \Delta_\theta)K(i,N_1;j,N_2) = 0$.


IV. CONCLUSION

New fast algorithms for solving the discrete 2-D Wiener-Hopf
equation on a polar raster when the covariance function has
block Toeplitz-plus-Hankel structure have been derived. Since we
have performed explicitly discrete derivations, instead of just
discretizing the continuous versions, the algorithms work
regardless of the number of points used. If adjacent points are
close enough, then the algorithm would reduce to the continuous
case [7].

The smoothing filter for estimating the points inside the disk
can be computed from the prediction filters using a generalized
discrete Bellman-Siegert-Krein identity. The overall complexity
is reduced compared with Gaussian elimination.

Unresolved issues include mapping of this algorithm into optimal
array processor architectures, the numerical stability of the
algorithm, and practical applications of this algorithm in
problems such as image restoration and coding.

Fast algorithms for computing the linear least-squares estimate of a multi-dimensional
random field from noisy observations inside a circle (2-D) or sphere (3-D) are derived. The
double Radon transform of the random field covariance is assumed to have a Toeplitz-plus-
Hankel structure; this is equivalent to the multi-dimensional spatial displacement property
$(\Delta_x - \Delta_y)k(x,y) = 0$. Note that this only reduces the number of degrees of freedom by
one; homogeneous and isotropic random fields are included as special cases. The algorithms
exploit this structure to reduce the amount of computation needed to solve the
multi-dimensional Wiener-Hopf equation

$$k(x,y) = h(x,y) + \int h(x,z)\,k(z,y)\,dz, \qquad |y| < |x|, \quad x, y, z \in \mathbb{R}^n$$

The algorithms can be viewed as generalized split Levinson and Schur algorithms, since
they exploit this structure in the same way that their one-dimensional counterparts exploit
the Toeplitz structure of the covariance of a stationary random process. The algorithms
are easily parallelizable, and they are recursive in increasing radius of the hypersphere of
observations. They have the form

$$(\Delta_x - \Delta_y)h(x,y) = \int V(x,e)\,h(|x|e,y)\,de, \qquad \|e\| = 1, \quad x, y \in \mathbb{R}^n$$

where $V(x,e)$ characterizes the filters $h(z,y)$ for $|y| < |z| < |x|$ much as the reflection
coefficients characterize the 1-D prediction filters of all orders. The discrete forms of the
problem and the algorithm are shown to be simply the obvious discretizations of the equations
given here.

It is important to note that these algorithms do NOT assume quarter-plane or asymmetric
half-plane support for the filter, as do previous "2-D" Levinson algorithms that
are really multichannel 1-D algorithms. The new algorithms are true multi-dimensional
algorithms that do not attempt to reduce dimensionality, but only take advantage of an
assumed structure of the covariance function.

An  earlier  version  of  this  work  was  presented  at  the  ICASSP  in  New  York.  The  new 
material  presented  here  includes: 

1.  The  discrete  form  of  the  problem,  and  the  discrete  algorithm  solving  it; 

2.  Numerical  results  on  the  performance  of  the  algorithm; 

3. A procedure for estimating a covariance of the desired form from a sample function of
a random field (i.e., a multi-dimensional "Toeplitzation plus Hankelization").